One of the perennial problems of

educational statistics is the interpre- 
tation of the coefficient of correlation 
between two variables. This is, to be 
sure, less of a problem for the limited 
few who (through good fortune or other- 
wise) have had extensive experience with 
all degrees of correlation, both high 
and low. And yet, even in this excep- 
tional group, one occasionally finds a 
misunderstanding or misstatement which 
can only be considered amazing. Thus, a
reputable statistician of the last decade
has written that a correlation of .75
between two variables means a closeness
of correspondence equal to "about 75 per
cent of perfect interdependence" (6,
p. 32). On no reasonable grounds is this
interpretation of a correlation of .75 
generally defensible (7, pp. 264-266; 14, 
14). 

Misinterpretation of the correlation
coefficient is more likely to occur when
the complicating factor of errors of
measurement enters in. Suppose,
for example, that it is desired to pre- 
dict a young person's aptitude for, let 
us say, medical research. In attacking 
this problem, a quantitative criterion 








of success in medical research would 
typically be set up; tests of ability 
(and possibly personality) would be de- 
vised to predict the criterion; and the 
multiple correlation between the tests 
and the criterion would be obtained. 
Even if this multiple correlation were 
as high as .75, the "index of forecasting
efficiency"² (7, pp. 268-271) would
be something less than 35 per cent. This, 
of course, represents an unsatisfactory 
state of affairs, especially since the 
multiple correlation between a test bat- 
tery and a criterion is commonly below 
.75.

At this point it is, however, im- 
portant to remember that the correlation 
between a test and a criterion is ordi- 
narily the correlation between the test 
and a decidedly fallible criterion. The 
low correlation between a test and a
criterion may obviously arise not merely 
through defects in the test, but also 
through defects in the criterion. It is 
desirable, therefore, to know not only 
the index of forecasting efficiency for 
the raw correlation between the test and 
the criterion (i.e., $E_{cx}$, where the letter
E stands for "efficiency", the
and Miss Ruth H. Krause for reading and criticism of the manuscript.
putation of Table 1, we are indebted to Mrs. Lina Hutson Aylesworth.


2. The "index of forecasting efficiency" is $1 - \sqrt{1 - r_{cx}^2}$, or $E_{cx}$. The radical
$\sqrt{1 - r_{cx}^2}$ (termed the "coefficient of alienation") is the generally accepted measure
of the degree to which test x has failed to predict the criterion, c; the quantity
$1 - \sqrt{1 - r_{cx}^2}$, or $E_{cx}$, states the degree to which test x has succeeded in
predicting the criterion, c.
subscript "c" stands for the criterion,
and "x" for the test); it is desirable
to know also the index of forecasting
efficiency for the correlation between the
test and the "true" criterion--i.e., the
criterion freed from random errors of
measurement. We want, in other words, not
$E_{cx}$, but $E_{c_\infty x}$ (where the subscript
"$c_\infty$" stands for a "true" score in the
criterion; and "x", as before, stands for the
fallible test).³ In general, the efficiency
of a test in predicting a true criterion
($E_{c_\infty x}$) is greater than its efficiency
in predicting the fallible criterion
($E_{cx}$); for in the former case, random
errors of measurement in the criterion are
balanced out, and this source of
discrepancy between test-scores and
criterion-scores is thus eliminated. The
index $E_{c_\infty x}$ is always greater than $E_{cx}$,
unless $r_{c_1 c_2}$ (the reliability coefficient
of the criterion) be 1.00; but this of
course never happens in actual practice.
The value of the index of forecasting
efficiency in the case of a true criterion
($E_{c_\infty x}$) may be quite simply obtained,
as follows. By definition,

$$E_{c_\infty x} = 1 - \sqrt{1 - r_{c_\infty x}^2}.$$

The value of $r_{c_\infty x}^2$ may most conveniently
be calculated from the formula,

$$r_{c_\infty x}^2 = \frac{r_{cx}^2}{r_{c_1 c_2}},$$

where $r_{c_1 c_2}$ is the Spearman-Brown
reliability coefficient of c, obtained by
correlating "split halves" and then applying
the Spearman-Brown formula. This
formula is given (in somewhat different
notation) by Kelley (8, p. 201). Substituting
this value of $r_{c_\infty x}^2$ in the definition
of $E_{c_\infty x}$ above, we obtain --

$$E_{c_\infty x} = 1 - \sqrt{1 - \frac{r_{cx}^2}{r_{c_1 c_2}}} \qquad (1)$$

When, by supposition, the criterion is
perfectly reliable (i.e., $r_{c_1 c_2}$ = 1.00),
formula (1) reduces, as it should, to
$1 - \sqrt{1 - r_{cx}^2}$ or $E_{cx}$, the conventional
"index of forecasting efficiency." The
use of formula (1) is, of course, restricted
to the cases where the Spearman-Brown
technique is considered applicable.
(Alternative methods for the evaluation
of $r_{c_\infty x}$ and the determination of $E_{c_\infty x}$
are presented in the Supplementary Note
at the end of the present paper.)
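
As a minimal sketch of formula (1), the following Python fragment computes the uncorrected and the corrected index side by side. The function names and the small floating-point tolerance are illustrative assumptions and form no part of the original paper.

```python
import math

def forecasting_efficiency(r_cx):
    """Uncorrected index: E_cx = 1 - sqrt(1 - r_cx**2)."""
    return 1.0 - math.sqrt(1.0 - r_cx ** 2)

def corrected_efficiency(r_cx, reliability):
    """Formula (1): 1 - sqrt(1 - r_cx**2 / r_c1c2).

    `reliability` is the Spearman-Brown reliability of the criterion.
    """
    radicand = 1.0 - r_cx ** 2 / reliability
    if radicand < -1e-9:  # tolerance is an assumption, to absorb rounding
        raise ValueError("r_cx exceeds the square root of the reliability")
    return 1.0 - math.sqrt(max(radicand, 0.0))

print(round(forecasting_efficiency(0.75), 3))      # 0.339
print(round(corrected_efficiency(0.70, 0.75), 3))  # 0.411
```

With the reliability set to 1.00 the second function returns the same value as the first, as the text requires.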

It is of some interest to compare the
function in formula (1),

$$\sqrt{1 - \frac{r_{cx}^2}{r_{c_1 c_2}}}$$

(or as we may symbolize it, $k_{c_\infty x}$), with the more
familiar "coefficient of alienation", $\sqrt{1 - r_{cx}^2}$
(or $k_{cx}$). The two functions are identical, except
that in the former, $r_{cx}^2$ is divided by the
decimal quantity, $r_{c_1 c_2}$. It is well known that
the value of $\sqrt{1 - r_{cx}^2}$





3. It is not difficult to imagine cases where the unreliable criterion, despite its unreliability,
should be considered final and absolute. Thus, one may wish to predict actual college grades,
rather than scholastic aptitude for college. In such a case, if the prediction of actual grades
is the sole interest, $E_{cx}$ and not $E_{c_\infty x}$ is the proper measure of predictive efficiency.
To the writers, however, it appears indefensible to limit one's interest and effort to this one
practical problem. If, for example, the college grades were seriously unreliable for the purpose
at hand (as would be shown by a wide discrepancy between $E_{c_\infty x}$ and $E_{cx}$), steps should be
taken to eliminate the unreliability and attendant unfairness of the grades.

Besides random errors of measurement in the criterion, reflected in the reliability coefficient,
there are, of course, various systematic errors. These, to the extent that they affect
individual scores unequally, also impair the validity of a criterion. No statistical correction
for such systematic errors is attempted in the present paper. A valid correction for
systematic errors would probably require more information concerning the nature and intensity
of these errors than is generally available.
drops very slowly for changes in $r_{cx}$ from, say,
.00 to .30, and very rapidly for changes in $r_{cx}$
from, say, .95 to 1.00 (2, p. 299). The function,
$\sqrt{1 - r_{cx}^2 / r_{c_1 c_2}}$, shows the same behavior, but
in even more pronounced form, since the effect of
division by the decimal quantity, $r_{c_1 c_2}$, is to
increase the higher values of $r_{cx}^2$ more than the
lower;⁴ and this accentuates the already pronouncedly
differential rate of decline of $k_{cx}$
for increasing values of $r_{cx}$.

Table 1 gives values of the index
of forecasting efficiency (with correction
for random errors in the criterion),
for various values of $r_{cx}$ (correlation
between test and criterion) and
$r_{c_1 c_2}$ (reliability of criterion).⁵ The
first column of Table 1 (in which $r_{c_1 c_2}$
is taken as 1.00) gives values of the index
of forecasting efficiency when (by
supposition) random errors do not occur
in the criterion-measurement at all. A
glance along the rows of the table, from
this first column, serves to indicate
the influence of unreliability of the
criterion upon values of the corrected
index of forecasting efficiency, for
given values of $r_{cx}$. A glance down the
columns of the table serves to indicate
the effect of changes in $r_{cx}$ upon values
of the corrected index of forecasting
efficiency, for a fixed value of $r_{c_1 c_2}$.⁶
It may be noticed, in the table, that no
data are included for values of $r_{c_1 c_2}$
below .30; it was felt that tests with a
reliability below .30 are of hardly any
practical or theoretical interest.⁷ Between
.30 and .60, $r_{c_1 c_2}$ is tabled in
intervals of .05; between .60 and .980,
in intervals of .01; between .980 and





1.000, in intervals of .005. The values

4. Thus, when $r_{cx}$ is .50 and $r_{c_1 c_2}$ is .80, $r_{cx}^2$ = .25 and $r_{cx}^2 / r_{c_1 c_2}$ = .25/.80
or .31--a rise of .06 above $r_{cx}^2$. This may be considered a comparatively modest rise. But
when $r_{cx}$ = .75 and $r_{c_1 c_2}$ is (as before) .80, then $r_{cx}^2$ = .56 and $r_{cx}^2 / r_{c_1 c_2}$ = .56/.80
or .70--a rise of .14 above $r_{cx}^2$.

5. Table 1 was independently computed by both authors. The first method of calculation made use of
formula (1), the value of the expression within the radical,

$$1 - \frac{r_{cx}^2}{r_{c_1 c_2}},$$

being computed correct to six decimals, and the square root obtained correct to three decimals
with the aid of Barlow's Tables (1). The second method of computation employed the formula,

$$E_{c_\infty x} = 1 - \sqrt{1 - r_{c_\infty x}^2};$$

$r_{c_\infty x}$ being computed by the formula

$$r_{c_\infty x} = \frac{r_{cx}}{\sqrt{r_{c_1 c_2}}},$$

and $\sqrt{1 - r_{c_\infty x}^2}$ being then obtained (without interpolation for the fifth decimal in
$r_{c_\infty x}$) by the use of Miner's Tables (11). These two methods of computation could not be
expected to yield exactly the same answer in every instance, since the absence of interpolation
in the second method occasionally introduces a slight error. All such discrepancies by the two
methods were, of course, investigated and resolved. The tables, as published, should be correct
to the number of decimals given.


6. The blank spaces in Table 1 require a word of explanation. The correlation, $r_{cx}$, cannot (except
by chance) exceed the correlation between actual scores in the criterion and true scores in the
criterion. This maximum correlation (symbolized as $r_{c c_\infty}$) is termed the "index of reliability";
numerically, it is equal to $\sqrt{r_{c_1 c_2}}$ (5, pp. 272-275). In Table 1, the first blank space which
occurs refers to the combination ($r_{c_1 c_2}$ = .995, $r_{cx}$ = 1.000). Now, when $r_{c_1 c_2}$ = .995, the
highest possible value of $r_{cx}$ is $\sqrt{.995}$ or .997; the combination in the table, therefore (in
which $r_{cx}$ = 1.000 and $r_{c_1 c_2}$ = .995) is impossible, or imaginary. Hence, the value of $E_{c_\infty x}$ has
not been computed for this combination. Indeed, if one attempted to compute $E_{c_\infty x}$ for this
combination, one would obtain the value

$$1 - \sqrt{1 - \frac{1.000^2}{.995}} = 1 - \sqrt{1 - 1.005} = 1 - \sqrt{-.005},$$

which is an imaginary number. A similar explanation applies to all the blank spaces in the
table.
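
The rule governing the blank spaces can be sketched in a few lines of Python. The function name `table_cell`, the use of `None` for a blank space, and the rounding tolerance are assumptions made for illustration only.

```python
import math

def table_cell(r_cx, reliability, tol=1e-9):
    """Corrected index for one cell, or None where the cell must be blank.

    A cell is blank when r_cx exceeds sqrt(reliability), i.e. when the
    quantity under the radical in formula (1) is negative.
    """
    radicand = 1.0 - r_cx ** 2 / reliability
    if radicand < -tol:
        return None  # the square root would be imaginary
    return 1.0 - math.sqrt(max(radicand, 0.0))

print(table_cell(1.000, 0.995))     # None
print(round(math.sqrt(0.995), 3))   # 0.997, the highest admissible r_cx
```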


7. Professor T. L. Kelley, for example, has recently suggested (in connection with the reliability 
of $r_{cx}$ are also tabled in terms of a
somewhat similarly varying interval, except
that values of $r_{cx}$ are included
down to .00.


Table 1 serves to emphasize certain
facts which, at times, seem insufficiently
appreciated. It has for some
time been generally realized that a slight
change of r in the region of, say, $r_{cx}$
= .95, is quite significant--much more
significant than a numerically equal
change in the region of, say, $r_{cx}$ = .30.
But it does not seem to have been equally
well realized that, when $r_{c_1 c_2}$ is
(shall we say) .60, a change in $r_{cx}$ from
.76 to .77 (if based on sufficient cases
to be reliable) would signify a greater
improvement in the corrected index of
forecasting efficiency, than a change in
$r_{cx}$ from .98 to .99 when $r_{c_1 c_2}$ = 1.00.
(As shown in Table 1, the change in predictive
efficiency in the former case is
from .807 to .891, in the latter case,
from .801 to .859.) In short, the region
of $r_{cx}$ where a slight change in numerical
values becomes significant, is
determined not alone by $r_{cx}$, but also--
and to an important degree--by $r_{c_1 c_2}$, the
reliability of the criterion. This effect
is particularly noticeable in connection
with the higher values of $r_{cx}$,
and also in connection with the very low
values of $r_{c_1 c_2}$. Illustrations: When
$r_{c_1 c_2}$ is .35, the change in $r_{cx}$ from .58
to .59 signifies more improvement in
the corrected index of forecasting efficiency,
than the change in $r_{cx}$ from .985
to .995 when $r_{c_1 c_2}$ = 1.00. (The change
in efficiency in the former case is from
.803 to .926, in the latter case is from
.827 to .900.) When $r_{c_1 c_2}$ is .81, the
change in $r_{cx}$ from .89 to .90 signifies
slightly more improvement in the corrected
index of forecasting efficiency, than
the change in $r_{cx}$ from .990 to 1.000 when
$r_{c_1 c_2}$ = 1.00. (The change in efficiency
in the former case is from .851 to 1.000,
in the latter is from .859 to 1.000.)
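
The three comparisons just cited may be checked numerically. The sketch below recomputes each pair of gains from formula (1); the helper names are hypothetical, and the reliability of .35 in the second comparison is the value that reproduces the quoted figures .803 and .926.

```python
import math

def corrected_efficiency(r_cx, reliability):
    """Formula (1), with rounding noise under the radical clamped to zero."""
    radicand = 1.0 - r_cx ** 2 / reliability
    return 1.0 - math.sqrt(max(radicand, 0.0))

def gain(r_low, r_high, reliability):
    """Improvement in the corrected index when r_cx rises from r_low to r_high."""
    return (corrected_efficiency(r_high, reliability)
            - corrected_efficiency(r_low, reliability))

# (fallible criterion, perfectly reliable criterion)
comparisons = [
    ((0.76, 0.77, 0.60), (0.98, 0.99, 1.00)),
    ((0.58, 0.59, 0.35), (0.985, 0.995, 1.00)),  # .35 reproduces .803 and .926
    ((0.89, 0.90, 0.81), (0.990, 1.000, 1.00)),
]
for fallible, perfect in comparisons:
    print(round(gain(*fallible), 3), round(gain(*perfect), 3))
```

In each row the first gain exceeds the second, which is the point the paragraph makes.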
Another fact in Table 1, worthy 
of separate and explicit mention, is the 




















varying effect of a given difference in
$r_{c_1 c_2}$ upon different levels of $r_{cx}$. If
one examines along the rows of Table 1
in the region of the higher values of
$r_{cx}$ (say around .80), one may be impressed
by the definite increase in the
value of $E_{c_\infty x}$ as the value of $r_{c_1 c_2}$
decreases ($r_{cx}$ itself remaining constant).
Evidently, then, when $r_{cx}$ is high, low
reliability of the criterion is a significantly
limiting factor upon the practical
efficiency of test-prediction. But
in the region of lower values of $r_{cx}$
(say around .30), a glance along the rows
of Table 1 shows that, while the values
of $E_{c_\infty x}$ do rise somewhat as $r_{c_1 c_2}$
decreases, the rise is quite small. For
low values of $r_{cx}$, then, changes in the
reliability of the criterion influence
the predictive efficiency of a test so
slightly, as to be practically negligible.
This is not to say that, with a
low value of $r_{cx}$, improvement in the
validity or nature of the criterion will
fail to be of any service; it is merely
to emphasize that--unless the reliability
of the criterion is extraordinarily
low--very poor predictive efficiency in
a test is not attributable to random errors
of measurement in the criterion.
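
A rough numerical illustration of this contrast, under the assumption of formula (1): the sketch below reports how far the corrected index rises above the uncorrected index at a low and at a high value of the test-criterion correlation. The function names and the particular reliabilities chosen are illustrative only.

```python
import math

def corrected_efficiency(r_cx, reliability):
    """Formula (1); reliability = 1.0 gives the uncorrected index."""
    radicand = 1.0 - r_cx ** 2 / reliability
    return 1.0 - math.sqrt(max(radicand, 0.0))

def rise(r_cx, reliability):
    """Corrected index minus uncorrected index for the same r_cx."""
    return corrected_efficiency(r_cx, reliability) - corrected_efficiency(r_cx, 1.0)

for r_cx in (0.30, 0.80):
    for reliability in (0.90, 0.80, 0.70):
        print(r_cx, reliability, round(rise(r_cx, reliability), 3))
```

At a correlation of .30 the rise stays within a few hundredths; at .80 it reaches several tenths when the reliability falls to .70.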

The first column of Table 1 gives values
of $E_{c_\infty x}$ when $r_{c_1 c_2}$ of formula (1) equals 1.000;
the values in this column, as previously stated,
are identical with the values of $E_{cx}$. The other
values of $E_{c_\infty x}$ in Table 1, to the extent that
they differ from $E_{cx}$ and are based on fallible
values of $r_{cx}$ and $r_{c_1 c_2}$, are, of course, merely
estimates; these values of $E_{c_\infty x}$ state the forecasting
efficiency that would occur, if $r_{cx}$ and
$r_{c_1 c_2}$ were exactly as found in the given sample,
and if the reliability of the criterion were
lifted from its sample-value to unity. The
larger the difference between $E_{c_\infty x}$ and $E_{cx}$, the
greater is the departure from actual empirical
finding, and the greater, correspondingly, should
probably be the caution of interpretation. A
large difference between $E_{c_\infty x}$ and $E_{cx}$ will, in
practice, generally arise from a low value of
the reliability coefficient, $r_{c_1 c_2}$. The consequence
of a low reliability coefficient is, in






















(Footnote continued) of a certain test in the ninth grade) that, "to yield a serviceable group
test, the reliability should be .40 or better" (10, p. 300). Previously, Professor Kelley had
suggested .50 as the lower limit of acceptability for the reliability of a test in a single
school grade (9, p. 211).
the first place, an increase in the sampling error
of $E_{c_\infty x}$; this may, in a sense, be experimentally
compensated for, simply by including sufficiently
more cases in the sample. A second
difficulty arising from a low reliability coefficient
is the increased difficulty of interpretation
of $E_{c_\infty x}$: The lower the reliability coefficient
of the criterion, other things being
equal, the greater will be the uncertainty as to
whether or not the criterion suffers not only
from random errors, but also from various unknown
systematic errors. Since the effect of
such systematic errors may be either to raise or
lower $r_{cx}$ from its proper value, the tabled values
of $E_{c_\infty x}$ in such a case may be either too
high or too low. The point we wish to make is
that the availability of a correction for the
unreliability of the criterion should certainly
not be taken as a quite satisfactory substitute
for actual high reliability; and given high reliability,
$E_{c_\infty x}$ will be close to $E_{cx}$.

Figure 1, on the following page,
presents, in graphical form, the value of
$E_{c_\infty x}$ corresponding to $r_{cx}$, when $r_{c_1 c_2}$
equals, respectively, .40, .50,
.80, .90, and 1.00.⁸ The heavy curve in
Figure 1 (giving values of $E_{c_\infty x}$ when
$r_{c_1 c_2}$ = 1.00) is identical with the curve
of the uncorrected index of forecasting
efficiency, $E_{cx}$. Comparison between this
heavy curve and the others will serve to
emphasize two facts already indicated
above: (1) The lower the reliability of
the criterion ($r_{c_1 c_2}$), the greater the
difference between the corrected and the
uncorrected index, for a given value of
$r_{cx}$; and (2) the higher the value of $r_{cx}$,
the greater the effect of a small change
in $r_{cx}$ upon the value of $E_{c_\infty x}$, for a
given value of $r_{c_1 c_2}$. Both these trends
are more conspicuous for the higher values
of $r_{cx}$, and negligible for very low
values of $r_{cx}$.

The use of Table 1 presupposes 
precise values of $r_{cx}$ (the correlation
between the test and the criterion), and
also of $r_{c_1 c_2}$ (the reliability coefficient
of the criterion). Too often, the
reliability of the criterion is not ex-
In such a case, the only
recourse is to estimate the value of
$r_{c_1 c_2}$, and to use the resulting value of
$E_{c_\infty x}$ with due discretion. Certainly it
is better to do this, than to make the
major blunder of calculating the index
of forecasting efficiency as if the reliability
of the criterion were really
1.00.

In notices and advertisements of
educational tests, one frequently runs
across such a statement as, "The test
correlates with teachers' marks about as
well as the marks correlate with themselves."
The implication seems to be
that such a test is about as good as any
that could be constructed: for (the
question is implied) how could a test be
expected to correlate higher with grades,
than the grades correlate with themselves?
This implication we consider
misleading. Certainly we may legitimately
ask that a reliable test, designed to
supply an adequate measure of achievement
in a given subject, should correlate
significantly higher with grades in
the subject, than two unreliable grades
correlate with each other. As a matter
of fact, reference to Table 1 shows that,
when $r_{cx}$ equals, say, .65, and $r_{c_1 c_2}$
equals .70, the index of forecasting efficiency
(even after correction for random
errors of measurement in the criterion)
is only .370 or 37.0 per cent.⁹ This
can hardly be considered a degree of efficiency
worth boasting of, certainly
not for individual guidance. With the
reliability of grades equal to .70, the
correlation between the test and grades

















8. In practice, a value of 1.00 for $r_{c_1 c_2}$ would probably never occur, any more than would the value
$r_{cx}$ = 1.00, or E = 1.00. All these values are, however, included in Figure 1, because of
their interest as theoretical upper limits.

9. To the extent that the criterion contains systematic errors affecting $r_{cx}$ (cf. footnote no. 1,
page 232), this figure is too low. Even so, however, the difference between 37.0 per cent and
perfection is far too great to be assigned wholly to systematic inadequacies of the criterion.
Conservative usage suggests rather that the discrepancies between test- and criterion-scores,
persisting after correction for unreliability of the criterion, should in general be assigned
mainly to inadequacies of the test.
Figure 1. Plot of the corrected index of forecasting efficiency ($E_{c_\infty x}$) for
stated values of the reliability coefficient of the criterion ($r_{c_1 c_2}$).
Axis label: correlation between test and criterion ($r_{cx}$).


would have to equal .72, for the cor- 
rected index of forecasting efficiency 
($E_{c_\infty x}$) to equal 50 per cent; it would
have to equal .82 for the corrected in- 
dex of forecasting efficiency to equal 
80 per cent. Looking at Table 1, we can 





find but little comfort in the supposed- 
ly reassuring statement that "the test 
correlates about as well with grades as 
the grades correlate with each other"-- 
unless, to be sure, the self-correlation 
of the grades themselves is uncommonly 


high. 
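
The required correlations quoted above can be recovered by solving formula (1) for the test-criterion correlation. The inversion below is an algebraic rearrangement offered as a sketch, not a formula stated in the paper; the function name is hypothetical.

```python
import math

def required_r_cx(target_efficiency, reliability):
    """Solve E = 1 - sqrt(1 - r**2 / reliability) for r.

    Rearranged: r = sqrt(reliability * (1 - (1 - E)**2)).
    """
    return math.sqrt(reliability * (1.0 - (1.0 - target_efficiency) ** 2))

print(round(required_r_cx(0.50, 0.70), 2))  # 0.72
print(round(required_r_cx(0.80, 0.70), 2))  # 0.82
```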
On the other side of the picture,
it is clear that the thoughtless or routine
use of the uncorrected index, $E_{cx}$,
instead of the corrected index, $E_{c_\infty x}$,
tends to underestimate the true worth of
tests. For example, when $r_{cx}$ equals .70
and $r_{c_1 c_2}$ equals .75, then the uncorrected
index, $E_{cx}$, equals only .286, or
28.6 per cent; corrected for random errors
of measurement in the criterion, the
index ($E_{c_\infty x}$) becomes 41.1 per cent--a
definite improvement over the uncorrected
index. Additional illustrations of a
similar sort can easily be found in Table 1.


SUMMARY 


A low correlation between a test
and a criterion may be due to inadequacies
of the test, inadequacies of the
criterion, or both. It is frequently desirable
to know the index of forecasting
efficiency of a test, after correction
for random errors of measurement in the
criterion. A formula is given by which
such a correction may be effected; in addition,
a table is presented of values of
$E_{c_\infty x}$ (the corrected index of forecasting
efficiency), for various values of $r_{cx}$
(correlation between test and criterion)
and of $r_{c_1 c_2}$ (reliability of criterion).
The table serves to emphasize the following
facts:

1. The region of $r_{cx}$ where a
slight change in correlation becomes significant
is determined not alone by $r_{cx}$,
but also, to an important degree, by
$r_{c_1 c_2}$. Thus, with $r_{c_1 c_2}$ equal to .60, a
change in $r_{cx}$ from .76 to .77 would signify
a greater improvement in the corrected
index of forecasting efficiency
($E_{c_\infty x}$), than a change in $r_{cx}$ from .98 to
.99 when $r_{c_1 c_2}$ = 1.00.


2. When $r_{cx}$ is high (around .75
or .80), low reliability of the criterion
is a significantly limiting factor upon
the efficiency of prediction by a test.
But when $r_{cx}$ is low (around .30), changes
in the reliability of the criterion only
negligibly affect the predictive efficiency
of a test. The reliability of
a criterion would have to be extraordinarily
low before one could legitimately
attribute the very low predictive efficiency
of a test to random errors of
measurement (i.e., unreliability) in the
criterion.

3. An educational test which
"correlates with grades about as well as
grades correlate with each other", may
still not possess a satisfactory degree
of predictive efficiency. Thus, if
$r_{c_1 c_2}$ (reliability or self-correlation
of grades) equals .70, and $r_{cx}$ (correlation
between test and grades) equals .65,
the index of forecasting efficiency--
even after correction for random errors
of measurement in the criterion--is only
37.0 per cent. Consideration of systematic
(as distinguished from random) errors
in the criterion, affecting $r_{cx}$,
may justify raising somewhat this figure
of 37.0 per cent--but hardly enough to
reach a degree of predictive efficiency
that could reasonably be called satisfactory.

4. The efficiency of tests constructed
for prediction-purposes, while
still undoubtedly lower than desirable,
is nevertheless greater than would be inferred
by use of the more commonly quoted
(but not generally more valid) formula,

$$E_{cx} = 1 - \sqrt{1 - r_{cx}^2}.$$

Figure 1 illustrates this fact, and Table 1
gives it a detailed, quantitative
expression.
SUPPLEMENTARY NOTE


The special assumption of formula (1) is 
that two "comparable" measures of the criterion 
are available--the term "comparable" being de- 
fined to include (a) equal correlation with the 
criterion, (b) equal reliability, and (c) equal 
standard deviations. In psychological and edu- 
cational work, the two comparable measures will 
ordinarily consist either of scores from "split 
halves", scores from alternate forms, or scores 
from re-measurement. A more general formula 
than (1), not requiring the assumption of com- 
parability (but still requiring two measures of 
the criterion), may be written as follows (cf. 
reference 2, p. 30): 


$$E_{c_\infty x} = 1 - \sqrt{1 - \frac{r_{c_1 x}\, r_{c_2 x}}{r_{c_1 c_2}}} \qquad (2)$$

In this formula, $c_1$ represents one measure of
the criterion, and $c_2$ another; $r_{c_1 c_2}$ is the correlation
between the two measures. The only restriction
upon $c_1$ and $c_2$ in formula (2) is that,
within the limits of random errors, both $c_1$ and
$c_2$ should be measures of the same thing; within
this restriction, $c_1$ and $c_2$ may represent "split
halves" of a single test, or measurements by
re-tests, or measurements by alternate forms, or
measurements by differing techniques, etc. Formula
(2), like all other formulas in this paper,
assumes that errors of measurement are uncorrelated
with each other, and with the "true" measures.


Formula (2) is required whenever the two
measures of the criterion, $c_1$ and $c_2$, are not
statistically "comparable." An illustration of
this occurs when the criterion consists of ratings
from, say, five judges. In such a case, it
would be well nigh impossible to find one sub-group
of judges whose average ratings are statistically
comparable with the average ratings from
the remaining sub-group.
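
Formula (2) may be sketched in Python as follows. The function name and the three correlations in the usage line are hypothetical values chosen only to show the call; they are not data from the paper.

```python
import math

def efficiency_two_measures(r_c1x, r_c2x, r_c1c2):
    """Formula (2): 1 - sqrt(1 - r_c1x * r_c2x / r_c1c2).

    r_c1x, r_c2x: correlations of the test with each criterion measure.
    r_c1c2: correlation between the two criterion measures.
    """
    radicand = 1.0 - (r_c1x * r_c2x) / r_c1c2
    if radicand < -1e-9:  # tolerance is an assumption, to absorb rounding
        raise ValueError("inadmissible combination of correlations")
    return 1.0 - math.sqrt(max(radicand, 0.0))

# Hypothetical correlations, e.g. ratings from two sub-groups of judges.
print(round(efficiency_two_measures(0.55, 0.60, 0.65), 3))
```

When the two test-criterion correlations are equal, the expression reduces to the same form as formula (1) with that common correlation in the numerator.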

The use of formula (2) is optional when
two experimentally independent and tolerably
comparable measures of the criterion are available;
for in this case, since the two measures are reasonably
comparable, the freedom of formula (2)
from assumptions loses most of its practical significance.
Presumably an advantage still remaining
in formula (2) is that this formula,
both in its numerator and in its denominator,
makes full use of all the available data; this
should render the probable error of formula (2)
smaller than that of formula (1). But if two experimentally
independent measures of the criterion
are available, the term $r_{cx}$ in the numerator
of formula (1) may, without essential alteration
of the formula, be modified to $r_{\frac{1}{2}(c_1 + c_2)\,x}$;
and if this is done, then formula (1) makes just
as full use of the entire empirical data as formula
(2). The use of the modified formula (1),
moreover, offers a practical economy, in that the
numerator, $r_{c_1 x} r_{c_2 x}$, of formula (2) calls for the
computation of two correlations between the criterion
and x; whereas $r_{\frac{1}{2}(c_1 + c_2)\,x}$
may be calculated by the formula--

[equation not legible in the source]

which involves only a single correlation between
the criterion-measures and x. With this modification
of formula (1), suggested by the availability
of additional data, formula (1) may be
rewritten--

$$E_{c_\infty x} = 1 - \sqrt{1 - \frac{(1 + r_{c_1 c_2})\, r_{\frac{1}{2}(c_1 + c_2)\,x}^2}{2\, r_{c_1 c_2}}} \qquad (3)$$

where $r_{c_1 c_2}$ stands, here, for the correlation between
the two experimentally independent measures
of the criterion.¹² The practical economy of
formula (3) becomes especially significant, if
the criterion enters into multiple correlation
with a series of tests, instead of (as generally
assumed in the present paper) merely into simple
correlation with a single test, x.

If the assumptions of the Spearman-Brown
formula are admissible, there is no advantage in
using formula (2) when the two measures of the
criterion are merely split-halves of a single
experimental measurement; for in this case, formulas
(1) and (2) are equivalent. The proof is as
follows. By the Spearman-Brown assumptions,
$r_{c_1 x}$ of formula (2) = $r_{c_2 x}$ = $r_{cx}$; hence formula
(2) may be written--

$$E_{c_\infty x} = 1 - \sqrt{1 - \frac{r_{cx}^2}{r_{c_1 c_2}}} \qquad (4)$$

where $r_{c_1 c_2}$ is the correlation between the two
split halves. Now, writing formula (1) in notation
conforming to the notation in formula (2)
(cf. footnotes 1 on page 242, and 2 below),

$$E_{c_\infty x} = 1 - \sqrt{1 - \frac{r_{(c_1 + c_2)x}^2}{r_{(c_1 + c_2)(c_1 + c_2)}}} \qquad (5)$$

but

$$r_{(c_1 + c_2)x} = \frac{\Sigma c_1 x + \Sigma c_2 x}{N \sigma_x \sqrt{\sigma_{c_1}^2 + \sigma_{c_2}^2 + 2 r_{c_1 c_2} \sigma_{c_1} \sigma_{c_2}}} = \frac{2 N r_{cx} \sigma_c \sigma_x}{N \sigma_x \sqrt{2 \sigma_c^2 + 2 r_{c_1 c_2} \sigma_c^2}} = \frac{2 r_{cx}}{\sqrt{2}\, \sqrt{1 + r_{c_1 c_2}}},$$

so that

$$r_{(c_1 + c_2)x}^2 = \frac{4 r_{cx}^2}{2 + 2 r_{c_1 c_2}} = \frac{2 r_{cx}^2}{1 + r_{c_1 c_2}}.$$

Also, by the Spearman-Brown formula,

$$r_{(c_1 + c_2)(c_1 + c_2)} = \frac{2 r_{c_1 c_2}}{1 + r_{c_1 c_2}}.$$

Hence

$$\frac{r_{(c_1 + c_2)x}^2}{r_{(c_1 + c_2)(c_1 + c_2)}} = \frac{2 r_{cx}^2}{1 + r_{c_1 c_2}} \cdot \frac{1 + r_{c_1 c_2}}{2 r_{c_1 c_2}} = \frac{r_{cx}^2}{r_{c_1 c_2}}.$$

Whence, substituting in (5),

$$E_{c_\infty x} = 1 - \sqrt{1 - \frac{r_{cx}^2}{r_{c_1 c_2}}},$$

which agrees exactly with (4) above.

Table 1 is of service whether formula
(1), (2), or (3) is employed. If formula (2)
has been used, enter the table with
$\sqrt{r_{c_1 x} r_{c_2 x}}$ instead of $r_{cx}$. If formula (3) has been used,
enter the table with $r_{x \cdot \frac{1}{2}(c_1 + c_2)}$ instead of $r_{cx}$,
and with $\frac{2 r_{c_1 c_2}}{1 + r_{c_1 c_2}}$ instead of $r_{c_1 c_2}$.

In this discussion, the advantage of
formula (1) has perhaps been lost sight of. The
chief advantage of this formula is that it calls
for only one experimental measure of the criterion,
thus reducing significantly the time and
cost of data-collection. The price of this advantage
is a larger number of assumptions, and
a higher probable error due to the paucity of
experimental data (12).

10. In formula (1), it will be recalled, $r_{c_1 c_2}$ stood not for the correlation between two split
halves, but for the Spearman-Brown reliability coefficient of the criterion. This difference in
notation between formula (1) and formula (2) should be carefully noted. The meaning of formula
(1) may obviously be extended to include any case where $r_{c_1 c_2}$ is the correlation between any two
comparable measures of the criterion, and $r_{cx}$ is the correlation between the test, x, and either
of the two comparable measures of the criterion.

11. The derivation of this formula may be found (in a somewhat different notation) in reference 4.

12. In formula (1), $r_{c_1 c_2}$ stood for a Spearman-Brown reliability coefficient--cf. footnote 1 on
page 242.
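
The equivalence of formulas (1) and (2) for split halves can also be checked numerically. The sketch below assumes the Spearman-Brown conditions (equal standard deviations of the halves and equal correlations of each half with the test); the function names and the sample correlations are hypothetical.

```python
import math

def via_halves(r_half_x, r_halves):
    """Formula (2) under Spearman-Brown assumptions: r_c1x = r_c2x = r_half_x."""
    return 1.0 - math.sqrt(1.0 - r_half_x ** 2 / r_halves)

def via_whole(r_half_x, r_halves):
    """Formula (1) written for the whole criterion c1 + c2."""
    # Correlation of the test with the sum of the two halves.
    r_sum_x = 2.0 * r_half_x / math.sqrt(2.0 + 2.0 * r_halves)
    # Spearman-Brown reliability of the whole criterion.
    r_sum_sum = 2.0 * r_halves / (1.0 + r_halves)
    return 1.0 - math.sqrt(1.0 - r_sum_x ** 2 / r_sum_sum)

for r_half_x, r_halves in [(0.50, 0.60), (0.30, 0.40)]:
    print(round(via_halves(r_half_x, r_halves), 6),
          round(via_whole(r_half_x, r_halves), 6))
```

The two columns agree to rounding error for any admissible pair of correlations.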





A TABLE FOR COMPUTING BISERIAL "r"¹


Laverne E. Kolbe and Harold A. Edgerton 
Occupational Research Program 
U. S. Employment Service 


The computation of any large number
of biserial correlations is a time-consuming
process. In order that the computations
may be speeded up, a table has
been developed. A convenient arrangement
of the formula was used:

$$r_{bis} = \frac{M_1 - M_t}{\sigma_t} \cdot \frac{p}{z}$$

where
$M_1$ = Mean of the continuous variable for one category of the dichotomized variable.
$M_t$ = Mean of the continuous variable.
$\sigma_t$ = Standard deviation of the continuous variable.
$p$ = Proportion of the total number of observations in the group from which $M_1$ is computed.
$z$ = Ordinate of the unit normal curve at point p.


The table has been arranged to
give r correctly to two decimals without
interpolation. To enter the table one
needs to know the value of p, and of
$(M_1 - M_t)/\sigma_t$. For convenience, the latter
quantity will be called A. It has been
suggested by J. W. Dunlap that A is much
easier to compute if the observations be
transmuted so that the values $M_t$ and $\sigma_t$
be such that finding the difference $M_1 - M_t$
and dividing by $\sigma_t$ can be done mentally.
Such values as $M_t$ = 50 and $\sigma_t$ = 10
are excellent for the purpose. This
transmutation scheme is of particular
value where a large number of biserial
correlations must be computed using the
same continuous variable, as in the case
of validating test items against an outside
criterion or against the internal
consistency criterion of total score.
The table is arranged as follows: 

Columns are identified by the 
value of p. Rows are identified by val- 
ues of biserial r. The table entry is 
the greatest value of A which, for that 
particular value of p, can give the val- 
ue of r shown for that row. In the ta- 
ble, values of A are all given to three 
decimals. The decimals have been omit- 
ted to make the table more compact. Thus, 
a table entry of 1617 is a A of 1.617. 


Directions for using the Table: 








1. Compute the values p and A.

2. Find column p.

3. Go down in column p until a table entry is found exactly as large as or just larger than A (taken without regard to sign).

4. The value of biserial r is read at the end of that horizontal row.

5. Assign the same sign to r as is attached to A.

Example: 

Assuming that $M_t$ is 50 and $\sigma_t$ is 10, the value of $M_i$ is found to be 43.62. From this, it is readily seen that A is -.638. $M_i$ is based on 32 cases out of 164, which is 20 per cent of the cases. Find column p = 20 in the table. Follow this down until the value of A, exactly as large as .638 or just larger, is found. The entry .636 is too small, so the entry just larger (.650) is used. This is in row r = .46. Give r the same sign as A, in this case r = -.46.
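The table lookup can be checked against the formula itself. The following is a minimal sketch, assuming the formula reads r = ((M_i - M_t) / sigma_t) * (p / z), with z the ordinate of the unit normal curve at the point cutting off the proportion p. The function name `biserial_r` and its argument names are illustrative only and do not come from the original article.

```python
from statistics import NormalDist


def biserial_r(group_mean, total_mean, total_sd, p):
    """Biserial r computed directly from the formula rather than the table.

    group_mean : mean of the continuous variable in one category (M_i)
    total_mean : mean of the continuous variable for all cases (M_t)
    total_sd   : standard deviation of the continuous variable (sigma_t)
    p          : proportion of all cases falling in that category
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    unit_normal = NormalDist()           # mean 0, standard deviation 1
    cut = unit_normal.inv_cdf(p)         # abscissa cutting off proportion p
    z = unit_normal.pdf(cut)             # ordinate of the curve at that point
    a = (group_mean - total_mean) / total_sd   # the quantity called A in the text
    return a * p / z


# Worked example from the text: M_t = 50, sigma_t = 10, M_i = 43.62, p = .20
print(round(biserial_r(43.62, 50.0, 10.0, 0.20), 2))  # -0.46
```

The direct computation agrees with the value read from the table in the example. The sketch uses p = .20, as the example does, rather than the unrounded 32/164.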





1. The table was constructed by the Statistical Unit of the Occupational Research Program to facilitate the computation of biserial correlations.
ON CERTAIN ESTIMATED CORRELATION FUNCTIONS AND THEIR STANDARD ERRORS¹


Edward E. Cureton
Alabama Polytechnic Institute


I. Comparable Tests versus
Experimental Independence


There are two methods, in general, of arranging a testing program for a study in which intercorrelations and reliability coefficients of a number of variates are to be obtained. The first and most common method is to give all the tests at one sitting, or at least on the same day. The reliability of each test is obtained by correlating halves and applying the Spearman-Brown Formula. This is true whether we have two forms of each test or only one. In the former case, the score on Form A is one half, the score on Form B is the other half, and the total score on the test is the sum of the scores on Forms A and B. It is these total scores which are used in computing the intercorrelations. In the latter case, an effort is made to split the items into comparable halves, usually by grouping the odd items in one and the even items in another, or by taking items 1,4,5,8,9, etc., as one form and items 2,3,6,7,10,11, etc., as the other. In this case, as in the former, the correlation between the half-tests is obtained first, and the reliability of the total test is estimated by the Spearman-Brown Formula. The total scores are used in computing the intercorrelations. This first method of procedure implies a definition of reliability and a set of assumptions regarding the tests and half-tests. The reliability implied in the definition might be designated "instantaneous reliability."

Any reliability coefficient may be considered as the ratio of the variance (squared standard deviation) of theoretical measures of the true or underlying ability to the variance of the corresponding test scores. In the case of "instantaneous reliability", the true ability means the ability of the subject, independent of the error of measurement of the test, at the exact time when he is tested. The error of measurement is assumed to lie entirely in the test, and the two half-tests are taken as equivalent random samples of all possible sets of items measuring the same ability. The assumptions implied in this first method of procedure may be called collectively the assumption of comparability.

In all measurements involving 
reliability coefficients the assumption 
is made that the errors of measurement 
in the two half-tests are uncorrelated 
with each other and with the underlying 
ability. The assumption of comparabil- 
ity means that in addition, the two half- 
tests will have equal units of measure- 
ment (though not necessarily the same 
zero-point), equal variances (but not 
necessarily equal means), equal reliabil- 
ities, and equal correlations with any 
other measure. The assumption of com- 
parability will in general be met when- 
ever two half-tests consist of an equal 
number of equally difficult similar 
items, similarly arranged. The equali- 
ties demanded are only approximate, i.e., 
equalities within the limits of the cor- 
responding sampling errors. 

1. A number of the standard errors here presented were first derived by Jack W. Dunlap and the writer working together. A number of these were published, but several more were not.

The second method of arranging the testing program is to give the tests on two separate occasions. In order to use this method, there must be two separate forms of each test, but these forms need not be comparable. All the Form A tests are given at one time and all the Form B tests at another. If the interval between the two testing sessions has been well chosen, we will then have an approach to experimental independence between Form A and Form B of each test. The reliability coefficient of each test will be the correlation between Form A and Form B. The intercorrelation between two tests will be some average (usually the geometric mean) of two correlations: that between Form A of the first test and Form B of the second, and that between Form B of the first test and Form A of the second. Note that all correlations are between a Form A test and a Form B test. Correlations between tests given at the same testing period, such as that between Form A of each of two different tests, or between Form B of each of two different tests, are not computed, since they are not experimentally independent, having been obtained at the same testing period.
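The second procedure can be illustrated with a short sketch. It assumes the two cross-session correlations have already been computed, and it takes the geometric mean that the text names as the usual average. The function name and the handling of signs are assumptions of this sketch, not part of the original.

```python
import math


def intercorrelation(r_a1_b2, r_b1_a2):
    """Geometric mean of the two cross-session correlations between two tests.

    r_a1_b2 : correlation of Form A of the first test with Form B of the second
    r_b1_a2 : correlation of Form B of the first test with Form A of the second
    """
    product = r_a1_b2 * r_b1_a2
    if product < 0:
        # Assumption of this sketch: a geometric mean is not defined when
        # the two correlations differ in sign.
        raise ValueError("the two correlations must have the same sign")
    sign = -1.0 if r_a1_b2 < 0 else 1.0
    return sign * math.sqrt(product)


# Hypothetical values, for illustration only.
print(round(intercorrelation(0.36, 0.64), 2))  # 0.48
```

Correlations between two forms given at the same session never enter the computation, in keeping with the requirement of experimental independence.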

This second procedure also implies a definition of reliability, which may be called "average reliability." The reliability coefficient is still the ratio of the variances of measures of the true ability to test scores. But the true ability now means, not the ability of the subject at the time tested (apart from errors of measurement in the test), but his average ability over any period long enough to include all types of short-time fluctuation, but short enough to preclude growth or decline of the ability. The short-time fluctuations in ability of the subjects have now joined the errors of measurement of the test, the true ability of a subject means his average true ability, and the reliability of a measurement includes both the reliability of the test and the reliability of the subjects. To obtain true experimental independence, the two testing periods must be separated by such an interval as to give two independent random samplings of the underlying abilities of the subjects. The average difference between the abilities at the first and second testing periods must equal the median fluctuation of such abilities over any short period (such as a few days or weeks), and the probability of a zero difference must be equal to the probability of a maximum difference. Certain practical suggestions would seem in point. The two testing periods should not come on the same day of the week, i.e., they should not be separated by exactly one or two or any other number of weeks. If the program of tests is not long, one period should probably come in the morning and the other in the afternoon. The tests should not be given in the same order (nor yet in exactly reverse order) at the two testing periods. At least one week-end should intervene between the two testing periods. If these suggestions, and any others that may occur to the investigator in connection with any particular testing program, are followed carefully, it is fairly probable that approximate experimental independence will be obtained.
This second method of procedure, as outlined above, has several advantages over the first. In estimating reliability, correcting for attenuation, etc., the important problems usually concern the estimation of the average true abilities of the subjects rather than their instantaneous true abilities. But more important than this is the fact that with experimental independence achieved, the need for comparability in general vanishes. Form A and Form B of each test, as long as they measure the same underlying ability with uncorrelated errors, need not be equally long, nor equally reliable. They need not even measure in the same units, and they need not have equal variances nor equal correlations with other measures. The errors of measurement in any two or more Form A tests or in any two or more Form B tests may be correlated without any harm to essential assumptions, as long as the errors in all Form A tests are uncorrelated with the errors in all Form B tests. In a number of cases, these advantages may be obtained by using the methods of computation appropriate to the procedure without experimental independence being present. In such cases the assumption of instantaneous true ability will still be present, but the assumption of comparability will be avoided. These methods of computation are usually longer than the methods appropriate to the first procedure, but they should always be used whenever comparability cannot be either demonstrated or assumed.

















II. Estimated Correlation Functions of Comparable Tests 


The basic formula in this procedure is the Spearman-Brown Formula for estimating the reliability of a total test from the correlation between its two comparable halves. The various formulas to be considered involve intercorrelations between total tests and reliabilities of total tests estimated from half-test correlations. In order to distinguish clearly between whole-test scores, half-test scores, etc., a special system of notation is advisable. This is presented in the table below.













                    Test 1      Test 2      Test 3      Test 4

"True" scores       x_α         x_β         x_γ         x_δ
Total scores        x_1         x_2         x_3         x_4
Scores, Form A      x_i         x_ii        x_iii       x_iv
Scores, Form B      x_I         x_II        x_III       x_IV












From this table it is readily seen that $x_1 = x_i + x_I$, $x_2 = x_{ii} + x_{II}$, etc. We define $r_1 = r_{iI}$, $r_2 = r_{ii\,II}$, etc., and

$$R_1 = \frac{2r_1}{1 + r_1}, \qquad R_2 = \frac{2r_2}{1 + r_2}, \quad \text{etc.}$$

The values $R_1$, $R_2$, etc., are the Spearman-Brown estimates of the reliabilities of the total tests whose scores are $x_1$, $x_2$, etc. All scores are assumed to be measured from their respective means as origins.








To obtain the standard errors of functions of intercorrelation coefficients and Spearman-Brown reliability coefficients, we require the sampling variances of the two types of coefficient and the sampling covariances of all possible types of pairs of coefficients. The sampling variance of any measure is the square of its standard error, and is designated by the symbol $\sigma^2$ with a subscript identifying the measure. The sampling covariance of two measures is their correlation from sample to sample multiplied by their two respective standard errors. It is designated by $\sigma$ with two subscripts to identify the two measures.

The two sampling variances required are already available.

$$N\sigma^2_{r_{12}} = (1 - r_{12}^2)^2. \tag{1}$$

$$N\sigma^2_{R_1} = 4(1 - R_1)^2. \tag{2}$$

The first is Pearson's well-known formula for the standard error of any correlation coefficient computed directly from the data. The second was derived and published by Shen (1924).









Of the sampling covariances required, two are also well known. These are 
the formulas for the sampling covariance of two correlation coefficients computed 
directly from the data, given first by Pearson and Filon (1898). 

$$N\sigma_{r_{12}r_{13}} = \tfrac{1}{2}\left[(1 - r_{12}^2 - r_{13}^2)(2r_{23} - r_{12}r_{13}) + r_{12}r_{13}r_{23}^2\right]. \tag{3}$$

$$N\sigma_{r_{12}r_{34}} = r_{13}r_{24} + r_{14}r_{23} + \tfrac{1}{2}r_{12}r_{34}\left(r_{13}^2 + r_{14}^2 + r_{23}^2 + r_{24}^2\right) - \left(r_{12}r_{13}r_{14} + r_{12}r_{23}r_{24} + r_{13}r_{23}r_{34} + r_{14}r_{24}r_{34}\right). \tag{4}$$


The other sampling covariances all involve Spearman-Brown reliability coeffi- 
cients. In order to obtain these covariances it is first necessary to note a few 
preliminary formulas, most of which have already been derived (Cureton and Dunlap, 


1930, A, B, C). These derivations are repeated for the convenience of the reader 
in Appendix I. 


(i + Ps oa 
Pia a ™ 1 a ions \5) 
V 
,/ i+ ri 
Tie TIe2 Tie \ ia on 


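$$N\sigma_{r_1 r_2} = \tfrac{1}{2}\,r_{12}^2\,(1 - r_1^2)(1 - r_2^2). \tag{7}$$

$$N\sigma_{r_1 r_{12}} = \tfrac{1}{2}\,r_{12}\,(1 - r_1^2)(1 - r_{12}^2). \tag{8}$$

$$N\sigma_{r_1 r_{23}} = \tfrac{1}{2}\,(1 - r_1^2)\left(2r_{12}r_{13} - r_{12}^2 r_{23} - r_{13}^2 r_{23}\right). \tag{9}$$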

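The next sampling covariance required is that of two Spearman-Brown reliability coefficients.

$$R_1 = \frac{2r_1}{1 + r_1}; \qquad R_2 = \frac{2r_2}{1 + r_2}.$$

Taking logarithmic differentials,

$$\frac{dR_1}{R_1} = \frac{dr_1}{r_1} - \frac{dr_1}{1 + r_1}; \qquad \frac{dR_2}{R_2} = \frac{dr_2}{r_2} - \frac{dr_2}{1 + r_2}.$$

Multiplying, summing for all samples, and dividing by the number of samples,

$$\frac{N\sigma_{R_1R_2}}{R_1R_2} = \frac{N\sigma_{r_1r_2}}{r_1r_2} + \frac{N\sigma_{r_1r_2}}{(1 + r_1)(1 + r_2)} - \frac{N\sigma_{r_1r_2}}{r_1(1 + r_2)} - \frac{N\sigma_{r_1r_2}}{r_2(1 + r_1)}.$$

Substituting from (7) and simplifying,

$$N\sigma_{R_1R_2} = 2r_{12}^2(1 - R_1)(1 - R_2). \tag{10}$$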


We require, finally, the two types of sampling covariance of a Spearman-Brown reliability coefficient and an intercorrelation. For the first type, taking the differentials,

$$dR_1 = \frac{2\,dr_1}{(1 + r_1)^2}; \qquad dr_{12} = dr_{12}.$$

Multiplying, summing for all samples, and dividing by the number of samples,

$$N\sigma_{R_1 r_{12}} = \frac{2N\sigma_{r_1 r_{12}}}{(1 + r_1)^2}.$$

Substituting from (8) and simplifying,

$$N\sigma_{R_1 r_{12}} = r_{12}(1 - r_{12}^2)(1 - R_1). \tag{11}$$





For the second type, taking differentials again,

$$dR_1 = \frac{2\,dr_1}{(1 + r_1)^2}; \qquad dr_{23} = dr_{23}.$$

Multiplying, summing for all samples, and dividing by the number of samples,

$$N\sigma_{R_1 r_{23}} = \frac{2N\sigma_{r_1 r_{23}}}{(1 + r_1)^2}.$$

Substituting from (9) and simplifying,

$$N\sigma_{R_1 r_{23}} = (1 - R_1)\left(2r_{12}r_{13} - r_{12}^2 r_{23} - r_{13}^2 r_{23}\right). \tag{12}$$
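The basic variances and covariances can be written as small functions for numerical work. The sketch below assumes the following readings of the formulas, each scaled by the sample size N: (1) as (1 - r²)², (2) as 4(1 - R)², (10) as 2·r12²·(1 - R1)(1 - R2), (11) as r12·(1 - r12²)(1 - R1), and (12) as (1 - R1)(2·r12·r13 - r12²·r23 - r13²·r23). All function names are illustrative and do not come from the original.

```python
import math


def spearman_brown(r_half):
    """Reliability of the total test from the correlation between its halves."""
    return 2 * r_half / (1 + r_half)


def n_var_r(r):
    """Formula (1): N times the sampling variance of a directly computed r."""
    return (1 - r ** 2) ** 2


def n_var_R(R):
    """Formula (2): N times the sampling variance of a Spearman-Brown R."""
    return 4 * (1 - R) ** 2


def n_cov_R1_R2(r12, R1, R2):
    """Formula (10): N times the sampling covariance of R1 and R2."""
    return 2 * r12 ** 2 * (1 - R1) * (1 - R2)


def n_cov_R1_r12(r12, R1):
    """Formula (11): N times the sampling covariance of R1 and r12."""
    return r12 * (1 - r12 ** 2) * (1 - R1)


def n_cov_R1_r23(r12, r13, r23, R1):
    """Formula (12): N times the sampling covariance of R1 and r23."""
    return (1 - R1) * (2 * r12 * r13 - r12 ** 2 * r23 - r13 ** 2 * r23)


def standard_error(n_scaled_variance, sample_size):
    """Turn an N-scaled sampling variance into a standard error."""
    return math.sqrt(n_scaled_variance / sample_size)


# Hypothetical half-test correlation of .60 in a sample of 100 cases.
R = spearman_brown(0.60)                              # 0.75, up to rounding
print(round(standard_error(n_var_R(R), 100), 3))      # 0.05
```

As a check on the reading of (2), the same value follows from (1) by the differential dR = 2 dr / (1 + r)², which is the route the text takes in deriving (11) and (12).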





The basic formulas for further derivations are (1), (2), (3), (4), (10), (11), and (12). In deriving any sampling variance or standard error, the system used consists in taking the differential or logarithmic differential of the function, squaring, summing for all samples, dividing by the number of samples (a theoretically infinite number of them), substituting from these basic formulas, and simplifying. In deriving any sampling covariance, we take the differentials or logarithmic differentials of the two functions, multiply, sum for all samples, divide by the number of samples, substitute from the basic formulas, and simplify. To sum for all samples and divide by the number of samples, we simply substitute for each squared differential the corresponding sampling variance, and for each product of two differentials the corresponding sampling covariance. This system of derivation gives only first approximations to the sampling variances and covariances desired, but closer approximations are not warranted when observed correlations obtained from the sample must be substituted in the formulas for the corresponding population correlations called for (Pearson and Moul, 1927). The assumptions involved in these derivations are:

1. That all samples are drawn from a population normally distributed with respect to all the variates measured.

2. That all samples are drawn from a population in which all the regressions are linear.

3. That all samples are large enough so that higher powers of the sampling errors are small in comparison with first powers, and may be neglected.

A further difficulty arises in interpreting the standard errors when derived. It is usual to assume that the sampling distributions of correlation functions are normal, and to interpret their standard errors by reference to a table of the normal probability integral. This assumption is unsafe in many cases, even though we know that as the size of the sample is increased, the sampling distributions of correlation functions approach the normal. The trouble is that for small samples, these sampling distributions are often far from normal, and in many cases the approach to normality with increase in sample size is very slow. For some of the functions to be considered, a sample of 1000 is still a small sample.

In spite of all the difficulties above noted, it is still true that an approximate standard error, based on an unknown sampling distribution, is better than none. The following standard error formulas are therefore offered for whatever they may be worth. It is to be hoped that eventually their exact sampling distributions will all be found.


$$r_n = \frac{n r_1}{1 + (n - 1) r_1}.$$

Spearman-Brown Formula for the reliability of a test n times as long as the half-test. Note that if n = 2, $r_n = R_1$.

$$\sigma_{r_n} = \frac{n(1 - r_1^2)}{\sqrt{N}\,\left[1 + (n - 1) r_1\right]^2}.$$

This formula was first given by Shen (1924).

$$n = \frac{r_n(1 - r_1)}{r_1(1 - r_n)}.$$

Spearman-Brown Formula solved for n to estimate the length of a test (measured in terms of the half-test length as the unit) necessary to achieve a given reliability.

$$\sigma_n = \frac{n(1 + r_1)}{r_1\sqrt{N}}.$$

This formula was first published by the writer in 1933.
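The general Spearman-Brown formula, its inversion for n, and the two standard errors can be sketched as follows. The sketch assumes the standard errors are those obtained by applying Pearson's formula (1) to the half-test correlation and differentiating, which is the system of derivation described above. Function names are illustrative only.

```python
import math


def lengthened_reliability(r1, n):
    """Reliability of a test n times as long as the half-test."""
    return n * r1 / (1 + (n - 1) * r1)


def se_lengthened_reliability(r1, n, sample_size):
    """First-approximation standard error of the lengthened reliability."""
    return n * (1 - r1 ** 2) / (math.sqrt(sample_size) * (1 + (n - 1) * r1) ** 2)


def required_length(r1, target):
    """Length (in half-test units) needed to reach the target reliability."""
    return target * (1 - r1) / (r1 * (1 - target))


def se_required_length(r1, target, sample_size):
    """First-approximation standard error of the required length."""
    n = required_length(r1, target)
    return n * (1 + r1) / (r1 * math.sqrt(sample_size))


# Hypothetical half-test correlation of .60 from a sample of 100 cases.
print(round(lengthened_reliability(0.60, 2), 2))       # 0.75
print(round(required_length(0.60, 0.90), 2))           # 6.0
print(round(se_required_length(0.60, 0.90, 100), 2))   # 1.6
```

In this illustration a reliability of .90 calls for six half-test lengths, with a standard error of 1.6 such lengths, so the estimate of n is quite rough even with 100 cases.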
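$$r_{\alpha 2} = \frac{r_{12}}{\sqrt{R_1}}.$$

Correlation corrected for attenuation in one variate. This variate is usually a criterion.

$$N\sigma^2_{r_{\alpha 2}} = r_{\alpha 2}^2\left[\left(\frac{1 - r_{12}^2}{r_{12}}\right)^2 + \left(\frac{1 - R_1}{R_1}\right)^2 - (1 - r_{12}^2)\,\frac{1 - R_1}{R_1}\right].$$

This formula was first given by Cureton and Dunlap (1930, B), in a slightly different form.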


2 
Talos 2res 2 2 (2 - a) 
0 = l-r -r ae ok ime cea 
N — 2 | (es) ( 12 is) Ry 





wr Ay 
2 2 2 . Bast .' 
- (1 - ris - Tis - Tas) - (2 - Tia _ sty ( Ry ) |. 
This formula will be needed in finding the standard error of the difference between 


the correlations of two tests with the same criterion, corrected for attenuation in 
the criterion in each case. 


$$E_{\alpha 2} = 1 - \sqrt{1 - r_{\alpha 2}^2}.$$

Index of forecasting efficiency for the case of a "true" criterion. This function is discussed by Conrad and Martin elsewhere in this issue of the J. Exp. Educ., and is designated Eo x by them.
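A short numerical sketch may make the index concrete. It assumes the index is 1 minus the square root of (1 - r²), applied to a test-criterion correlation that has first been corrected for attenuation in the criterion by dividing by the square root of the criterion's reliability. The function names and the illustrative figures are assumptions of this sketch.

```python
import math


def corrected_for_criterion(r12, criterion_reliability):
    """Test-criterion correlation corrected for attenuation in the criterion."""
    if criterion_reliability <= 0:
        raise ValueError("criterion reliability must be positive")
    return r12 / math.sqrt(criterion_reliability)


def forecasting_efficiency(r):
    """Index of forecasting efficiency, 1 - sqrt(1 - r^2)."""
    if abs(r) > 1:
        raise ValueError("a correlation cannot exceed 1 in absolute value")
    return 1 - math.sqrt(1 - r ** 2)


# Hypothetical case: observed correlation .60, criterion reliability .64.
observed = 0.60
corrected = corrected_for_criterion(observed, 0.64)
print(round(corrected, 2))                              # 0.75
print(round(forecasting_efficiency(observed), 3))       # 0.2
print(round(forecasting_efficiency(corrected), 3))      # 0.339
```

A corrected correlation of .75 gives an index of about .34, which agrees with the remark by Conrad and Martin that a correlation of .75 yields a forecasting efficiency of something less than 35 per cent.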





$$N\sigma^2_{E_{\alpha 2}} = \frac{r_{\alpha 2}^4}{1 - r_{\alpha 2}^2}\left[\left(\frac{1 - r_{12}^2}{r_{12}}\right)^2 + \left(\frac{1 - R_1}{R_1}\right)^2 - (1 - r_{12}^2)\,\frac{1 - R_1}{R_1}\right].$$














$$r_{\alpha\beta} = \frac{r_{12}}{\sqrt{R_1 R_2}}.$$

Correlation corrected for attenuation. This formula is equivalent algebraically to the formula,

$$\frac{r_{i\,ii} + r_{i\,II} + r_{I\,ii} + r_{I\,II}}{4\sqrt{r_1 r_2}},$$

whose standard error has been given by Shen in a very lengthy form.
This standard error function was derived by Cureton and Dunlap (1930, A), but the formula as published contained an error in one of its terms, and should be replaced by the present formula.

These last two formulas will be necessary in obtaining the standard error of the difference between two correlations corrected for attenuation, as well as in obtaining the standard errors of other functions of correlations corrected for attenuation.


Before proceeding further, we define arbitrarily two functions, A and B. 


(1 - rig)(1 - roa) + (risTae - TieT23)"- 
= (ris + rie + Pan + Toa) + Orialsa(Tislas + Tialas) 
- 2rya(Tisles + TiaT2a) - OFaa(TisTia + Teslea)- 
t = Taslaa — Tialase 


Tetrad difference. This is the usual form of this function, as employed in the 
study of Spearman's theory of two factors. 


2 2 2 2 2 2 2 . 
No, =B+t (rig + Tris + Tig + Tas + Tae + Tae - 4)- 
This formula was first given by Kelley (1928, p. 49). 
Vector correlation. This function was derived by Hotelling, who suggests that it be used instead of the tetrad for the purpose of testing the two-factor theory.


a® — ~B 


(1 - ris)*(1 ” r3.)” 





Nog = (1 - Q°)* - 


This formula was first derived by Madow. Neither of these last two formulas has 
been published to date, so far as the writer is aware. 
Tetrad of correlations corrected for attenuation. This has been proposed as a substitute for t in testing the theory of two factors.
A formula for the standard error of a tetrad of correlations corrected for attenuation has been given by Garrett and Anastasi (1932). Their formula is based on the wholly inadmissible assumption that the sampling correlations between pairs of correlations corrected for attenuation are equal respectively to $r_{r_{12}r_{13}}$ and $r_{r_{12}r_{34}}$. Their formula should be superseded entirely by the one given above.
r sy r, r, Tr Pi3 i 
eo ee «SZ ss * (34) 
The triad. This function is equal to the proportion of the non-chance variance of $x_1$ which is due to a general factor running through all three variates, if the theory of two factors holds, so that the three variates may be considered to be composed of one general factor and three specific factors. This function has been discussed at some length by Cureton and Dunlap (1930, C).
This formula will be needed in computing the standard error of the difference between two triads, in order to estimate the significance of the difference between the general-factor saturations of tests $x_1$ and $x_2$. Finally, we may note the value of the symmetric determinant,


1 Tiz Tis Tia 
III. The Case of Experimental Independence


This case has been treated by the writer elsewhere at some length (Cureton, 
1931). No effort will be made here to summarize this work as has been done above 
for the case of comparable measures. 

There is one special case, however, which is of interest in connection with the paper by Conrad and Martin in this issue of the J. Exp. Educ. The index of forecasting efficiency for the case of a "true" criterion may be estimated by means of the second procedure, whether experimental independence is obtained or not. This index is a function of one set of test scores and two sets of criterion measures. If these measures are experimentally independent, the reliability of the criterion will be an "average reliability." If they are not, it will be an "instantaneous reliability." In either case, we compute $r_1 = r_{iI}$, the reliability coefficient of the criterion, and also $r_{i2}$ and $r_{I2}$, the correlations between each set of criterion measures and the test scores. Comparability is not demanded, whether experimental independence is present or not. The only requirement is that errors of measurement in the two criterion measures shall be independent of each other and of the true abilities of the subjects. In obtaining this necessary independence, the investigator is free to divide his criterion into halves in any way he chooses. The two half-criterion measures may be wholly unequal in number of items (average of 2 judges against average of 3, e.g.), units of measurement, standard deviations, and correlations with the test. The index may be written in the present notation,
Let $r_{i2}r_{I2}/r_1 = F$. Then $E_{\alpha 2} = 1 - \sqrt{1 - F}$. Taking differentials, squaring, summing for all samples, and dividing by the number of samples,

The value of $\sigma_F$ is known (Cureton, 1931, p. 55, Formula 17). Substituting in the above equation and simplifying,

Substituting from (5) and (6) and simplifying,


1 2\ 2 2 
THE GENERAL NATURE AND APPLICABILITY
FOR EDUCATION¹

by

Douglas E. Scates
Director of School Research
Cincinnati, Ohio

Contrary to the belief held by many persons, index numbers are not restricted to the realm of general economics [...] which do not directly involve money in any way. Index numbers have received a wider use in the field of education than is generally recognized. Perhaps their varied use in educational problems is not better known because they have not been expressly treated in educational literature as a technique of general applicability.² The procedure has been borrowed from economics and applied to educational problems by individual workers without attention having been called explicitly to the nature of the adaptations that have been made, and to the possibilities of further use, both for immediate service and for research. It is the purpose of this article to make a beginning in this direction. It will discuss some of the general characteristics of index numbers,³ and refer to the uses that have been made of them in education.





| 





index number 
fords an important 
(measuring) complex varia 
especially adapted 
in phenomena which cannot 
identified except in te 
component or resultan 
take a common example 
"cost of living", is 
which does not exist 
tity, and cannot be dealt 
as a single object. Its ; 
upon the many variable elements whi 
compose it. The same thing is true 
regard to many of our concepts of qual 
ties or properties of ‘omplex 
phenomena--i.e., those which comprise a 
number of different kinds of elements. 
If one desires to measuré 
character as "cost of living’, 
of school systems, or 
of school systems, one 
four courses open to him: (1) he can 
rely on a single selected characteristic 
which can be directly measured unt 
ed, to represent the general character; 


The 


means 


+ m ac 
vv meas 


etat) 
~ ws 


logically 


"seneral goodn 


theoretically has 
1. This article is based in part upon material in a section of a forthcoming book, The Methodology of Educational Research, by Carter V. Good, A. S. Barr, and Douglas E. Scates, to be published this spring by the D. Appleton-Century Co., New York.

2. Exception should perhaps be made for two articles by Clark; his treatises are however limited to price index numbers. Harold F. Clark, Index Numbers in School Administration. Bulletin of the School of Education, Indiana University, III (January, 1927), No. 3. Bloomington, Indiana: Bureau of Cooperative Research, August, 1928. P. 55. See also, by the same author: "Index Numbers in Educational Work", Teachers College Record, XXX (February, 1929), 453-60.

3. Frisch gives an admirable summary of the characteristics and the theoretical bases of index numbers. The treatment however is restricted to price index numbers. Ragnar Frisch, "Annual Survey of General Economic Theory: The Problem of Index Numbers", Econometrica, IV (January, 1936), 1-38.

4. Measurement procedures appropriate for different kinds of variables are treated by the writer in a forthcoming article in Psychometrika entitled, "The Essential Characteristics of Measurement."
(2) he can utilize some effect (resultant or dependent variable) of the character he is striving to index; (3) he can rely upon personal estimates of the degree to which the characteristic or quality exists in different situations; or (4) he can utilize the index number technique to combine two or more quantifiable characteristics which are included in, or definitely correlated with, the general concept which he desires to index.⁵ This last procedure is usually the most satisfying, if it can be carried out properly, because it reflects variations in the factors which (normally) contribute to variation in the phenomenon being studied. It may be pointed out also that there is no mathematical reason why certain resultant variables could not be included also if it seemed logically desirable to do so, or if it had been proved by research to be helpful to do so.

Index numbers are most commonly thought of as applying to variation from time to time, as from one year to another, but there is nothing in their nature that makes them more applicable to time series than to variation expressed in relation to any other significant factor (presumably an independent one). In the case of index numbers referred to time, the particular times for which the numbers are calculated serve merely as designated points, expressed in terms of the variable "time", for the observation of the values of the sundry components. In the case of index numbers which reflect variation from place to place, the locations serve as observation points, and relate the recorded data to the significant variable, "space", or "place." Any other general variable that is appropriate to the particular problem might be used just as well, in lieu of time or space. In fact, the place-to-place index numbers are really related more closely to "situation" than to "space" as a variable, for observations are not taken at regular intervals (along a geographical line), as in the case of time, but are rather taken at certain locations (usually cities) identifiable by common knowledge as relatively unique conglomerates. The index number technique is applicable wherever a general variable to be indexed can be regarded as representable by the summation of (weighted) percentage variations in a number of elements, the observations of these elements being taken at identifiable points.

An index number may be looked upon either as a weighted average of ratios or as a ratio of weighted summations (aggregates). Fisher⁶ lays stress on the fact that it is an average of ratios, while King⁷ is equally emphatic in stating that it is a ratio of aggregates, and Young⁸ recognizes three different types. Such distinctions are scarcely of mathematical significance, since a weighted average is practically a ratio of aggregates; these emphases may, however, be helpful at different times in seeing the nature of index numbers from different points of view under varying
5. Note however that there are certain limitations or restrictions in the interpretation of index numbers, as pointed out by Leontief, and others. See Wassily Leontief: "Composite Commodities and the Problem of Index Numbers", Econometrica, IV (January, 1936), 39-59. Frisch (footnote 3, on the preceding page) also deals with this problem. It is outlined later in the present paper.

6. For definitions of index numbers, see Irving Fisher, The Making of Index Numbers, Chapter I, Publications of the Pollack Foundation for Economic Research, No. 1. Houghton Mifflin Co., 3d ed., rev., 1927, (2d ptg., December 1931), p. 538.

7. Willford I. King, Index Numbers Elucidated, Chapter III, "The Nature of Index Numbers", especially pp. 46-49. New York: Longmans, Green and Co., 1930, p. 226.

8. See the comments of Allyn A. Young, p. 181, in Handbook of Mathematical Statistics, ed. by [...] Houghton Mifflin Co., 1924, p. 221.
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circumstances. While index numbers are 
commonly conceived of as consisting of 
two sets of factors, one a set of items 
or elements (p) of the general concept, 
and the other a set of weights (q) ap- 
plied to these elements, it is readily 
possible for index numbers to embody 
three or more sets of factors. This may 
be accomplished by combining two or more
variables into some function of them, and
using the resulting values as a single
set of factors in the index number, or it
may be done through adapting the formula
to accommodate additional factors.9
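
The remark that a weighted average of ratios is practically a ratio of aggregates can be checked numerically. The sketch below is illustrative only: the figures are hypothetical, and it assumes that each ratio is weighted by the base value of its element (base observation times weight), which is the case in which the two forms agree exactly.

```python
# Hypothetical data for three elements; the figures are illustrative only.
p0 = [2.0, 5.0, 10.0]   # value of each element at the base point
p1 = [3.0, 5.5, 9.0]    # value of each element at the given point
q = [10.0, 4.0, 1.0]    # weight (quantity) attached to each element


def ratio_of_aggregates(p0, p1, q):
    """Index as a ratio of weighted summations: 100 * sum(p1*q) / sum(p0*q)."""
    given_total = sum(p * w for p, w in zip(p1, q))
    base_total = sum(p * w for p, w in zip(p0, q))
    return 100.0 * given_total / base_total


def weighted_average_of_ratios(p0, p1, q):
    """Index as an average of the ratios p1/p0, each weighted by its base value p0*q."""
    base_values = [p * w for p, w in zip(p0, q)]
    ratios = [a / b for a, b in zip(p1, p0)]
    weighted_sum = sum(v * r for v, r in zip(base_values, ratios))
    return 100.0 * weighted_sum / sum(base_values)


aggregate_form = ratio_of_aggregates(p0, p1, q)
average_form = weighted_average_of_ratios(p0, p1, q)
print(round(aggregate_form, 6), round(average_form, 6))
```

With any other choice of weights for the ratios the two forms would generally differ, which is one reason the distinction can still be useful as a point of view.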

One of the features of an index 
number which distinguishes it from just
any weighted composite is that variations
in the value of the items are conven-
tionally expressed as per cents. These
per cents or ratios represent the chang- 
ing values of the items referred to their 
value at some selected point (time, or 
place, or situation), designated as the 
base, for which the index number will be 
100. This use of ratios affords a unit 
of measure which will be comparable from 
item to item, at least within a certain 
sense. The selection of a base point (a 
given year, locality, or situation), with 
its attendant value for each factor, 
while not entirely without effect on the 
resulting comparisons, is not a crucial 
matter, and may be done arbitrarily to 
suit convenience. Various forms of base
are sometimes used, such as an average of
the values at several points, or a moving
base, resulting directly in link rela-
tives, which may subsequently be referred
back to a single fixed base.

Douglas E. Scates

Occasionally a special form of
base is used, and this may not grow di-
rectly out of actual values. Thus Ayres,
in his historic index numbers for state
school systems,10 did not use actual
values as a base, but theoretical values
representing standards. Again, indexes
of business or economic activity which
reflect variations above and below normal,
as represented by Babson's charts, or
"American Business Activity Since 1790",11
do not express the values as per cents of
a fixed base, designated at some [...]
time, but rather as per cents of [...]
which is a function of the aggregate
variations themselves, and which is,
therefore, in a sense, a dependent
variable.

It is not, however, appropriate
to regard every series of ratios as in-
dex numbers, even though many of them
are called such. The concept of an in-
dex number posits a variety of elements
which will be combined into a summation 
to represent as closely as possible a 
complex variable, and a single element 
in itself does not satisfy this condi-
tion. Thus, the use of holding power of
the schools,12 or the average number of
days of attendance for each child of
school age,13 as an index of the effi-
ciency of school systems, can scarcely
be regarded as an illustration of index
numbers. On the side of logic, it may
be said that these elements are too sim-
ple to constitute adequate representa-
tions of school efficiency in any gener-
al sense, however satisfactory they may
be for certain special purposes. An in-
dex number presumes to be generally rep-
resentative unless explicitly limited.
On the technical side, it may be said
that such series are simply values
turned into per cents. Young aptly
terms them "relatives", or "relative num-
bers." They constitute single elements
which, along with a number of other dif-
ferent elements, might be combined to
form an index number. Here the concept
of an index number as an average of sev-
eral such series of ratios is serviceable.

Again, the use of natural aggre-
gates, even when expressed as percentage
variation from time to time or from place
to place, lacks certain of the character-
istics of an index number. Thus, one may
calculate trends showing the change dur-
ing the past ten or twenty years in school
enrollment, in school building construc-
tion,14 or in school expenditures, using
totals for states or for the nation. It
may be argued that as totals these repre-
sent summations of many component vari-
ables--more, in fact, than could be se-
cured for constructing an index number,
and they are weighted exactly right. Such
contentions are probably fair, but the
index number technique is a procedure for
weighting and combining various selected
elements, and the employing of natural
totals precludes the application of these
processes. The matter at issue is, of
course, one of definition, and not of
value. The availability of complete to-
tals which meet the conditions of the
general concept to be indexed is extreme-
ly fortunate, and such figures are su-
perior for most purposes to any that an
index number could yield. When expressed
as per cents, they should, however, be
called "ratios", or "relatives."

One occasionally finds the term
"index number" used where practically
none of the characteristics of a conven-
tional index number are present. For
example, an index number of teacher
training16 and an index number of first
year university work17 have been report-
ed, neither of which was expressed in
per cents, nor prepared by combining da-
ta for a number of separate elements.
Such use of the term for special indexes
developed for particular purposes and
representing almost any kind of function
must be regarded as colloquial, and un-
desirable; it cannot help being mislead-
ing, both as to the nature of what is
presented, and as to the general nature
of index numbers. Burns, in calculating
his index of transportation need,18 prop-
erly refrains from calling it an index
number.

A third characteristic of the 
typical index number is that it is a sam- 
ple. Probably this characteristic would 
not be set up as a requirement, but it 
is at least typical. An index number is 
expected to represent the fluctuations 
of a general category, or of a large 
class of elements, through being calcu- 
lated from a selected group of elements 
which are a representative sample of the 
entire class. In other words, an index 
number is expected to be the basis of a 
generalization about a larger group of 
variables than those which are actually 
included in the calculation.19 

It is interesting to take notice 








14. See for illustration, "The Nation's School Building Needs", Research Bulletin of the National Ed-
ucation Association, XIII (January, 1935), No. 1. Figure II on p. 7, and Figure VII on p. 27.

15. John K. Norton, The Ability of the States to Support Education. Washington, D. C.: The Nation-
al Education Association, 1926. P. 88.

16. W. R. Burgess, "The Education of Teachers in Fourteen States", Journal of Educational Research,
III (March, 1921), 161-72. W. R. Burgess, "The Rate of Progress in Teacher Preparation", Journal
of Educational Research, IV (October, 1921), 180-86.

17. Douglas E. Scates, "A Study of High-School and First-Year University Grades." School Review,
XXXII (March, 1924), 182-192.

18. Robert Leo Burns, Measurement of the Need for Transporting Pupils. Contributions to Education,
No. 289, N. Y.: Teachers College, Columbia University, 1927. P. 61.

19. The restrictions earlier called attention to (see footnote 5, page 266) concern primarily the
interpretation of index numbers, and should be borne in mind whenever they are used.

March, 1936

of the several different ways in which 
this sampling may occur. In the first 
place, the general class may be sampled,
as just referred to, by selecting cer- 
tain constituent elements. Thus, not all 
kinds of articles sold at wholesale, not 
all elements of size or merit of school 
systems, would or could be included in 
an index number of wholesale prices or of 
excellence of school systems. Certain 
ones would be chosen to represent the en- 
tire group. On the other hand, an index 
number of school bonds issued for build- 
ing construction deals with a concept 
which is relatively homogeneous, and, 
after an appropriate definition for the 
class has been worked out, there would 
not appear to be any problems of sampling
the constituents of the class, since
(presumably) whatever sub-classes there
were would behave approximately alike. In
similar manner, one could construct an
index number for other general variables
which were either simple (with respect to
sub-classes of significant elements) or
which were reasonably complex but com-
pletely represented, without encounter-
ing important problems of sampling the
constituents. As an example of complete-
ly representing a complex variable, a
manufacturing company might make an
analysis at the beginning of each year, or
each quarter, of the orders on hand and 
the detailed operations required on each 
of its machines, to prepare an index num- 
ber of its production load for the next 
period ahead. 

A second way in which sampling 
may occur is with reference to the field.
Index numbers of commodity prices common-
ly sample both the class and the field;
that is, they include only selected items,
and they price these at selected locali-
ties throughout the country. An index
number of a simple case (as school bonds)
would normally sample only the field; for
example, prices would be secured from
various localities, each price probably
being weighted by the number of bonds sold
(either at that price, or in that local-
ity). Field samples may, of course, be
taken with respect to any significant
variable in the field--time, [...] or
specified condition.

Data which are complete with
reference to geographic distribution [...]
such for example as index numbers for
all of the forty-eight states (varying
from place to place), may be regarded as
field samples with respect to some other
variable, as special condition, or pos-
sibly time. Perhaps it is worth call-
ing attention to the fact that the sam-
pling will not be done on the independ-
ent or reference variable which is con-
sidered of major significance; that is,
sampling prices of commodities by taking
selected localities throughout the na-
tion is done to secure a representation
of the prices generally prevailing at
that time for an index number whose chief
independent variable is time. (While
index numbers are reported for prices at
different localities these are not place-
to-place index numbers, since their per-
cent variation is expressed on a time
and not a location base.) Similarly,
sampling with respect to time is appro-
priate when the chief independent vari-
able is something else. For example,
time sampling of children's behavior in
observational studies is appropriate
when variation is not to be related to
time but to "different pupils" or "dif-
ferent situations" as the independent
or reference variable.

Attention was earlier called to
the value and appropriateness of the in-
dex number technique generally for rep-
resenting complex variation. Since mul-
tiple regression equations also provide
a means of indexing complex variation,
in terms of a summation of weighted com-
ponents, it may be desirable to make cer-
tain comparisons of the two techniques.
Some of the distinctions exist with re-
gard to form; perhaps the most important
ones exist with regard to use. For
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example, the multiple regression equa- 
tion requires a suitable quantitative 
criterion, whereas this may be the very 
thing that one is seeking.20 It is in
fact quite possible that an index number 
representing some complex phenomenon 
might be constructed to serve as the de- 
pendent variable for the purpose of solv- 
ing a multiple regression equation. In 
addition to the requirement of a criteri- 
on, the multiple correlation technique, 
as commonly used, requires linear rela- 
tionships; it is not well adapted to a 
large number of variables; and the system 
of weighting is relatively simple. In the
case of index numbers, on the other hand,
there is no requirement of a mathematical
criterion, there is no assumption as to
the mathematical form of relationships
between the values of the elements, the
technique may be readily extended to in-
clude hundreds of component variables,
and the weighting may be built up in prac-
tically any form desired, even varying
the form with the particular element if 
this is appropriate. 

Interesting perhaps more from the
theoretical standpoint than the practi- 
cal, is the fact that variation from ob- 
servation to observation is not essen- 
tial to an index number, while it is a 
requirement for correlation. That is, 
index numbers might conceivably continue 
to be 100 for several successive sets of 
observations; if correlation were at- 
tempted for these values, the coefficient 
would be either zero or indeterminate 
(0/0), for the entire scattergram would 
be concentrated along one axis or at a 
single point. That is, a series would
consist of constants. Of somewhat more 
practical significance is the fact that 
an index number requires only two sets of 
observations (that is, two observations
for each variable or element), so that an
index number becomes possible as soon as
a second observation point has been
reached. Simple correlation can be cal-
culated between two observations for each
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variable, but it is either 1.00, -1.00, 
zero, or indeterminate, and the partial 
correlations are chaotic and meaningless 
in most cases. In both techniques dif- 
ferences in means from one variable to 
another are immaterial; differences be- 
tween the dispersion of one variable and 
that of another have the effect of weight- 
ing in index numbers, while in correla- 
tion they are of no effect unless the 
variability of a certain element is ab- 
normally restricted, as by selection, in 
which case the correlation is lowered 
and thus ultimately the weight of that 
element is reduced, as in the index num-
ber. In both techniques the separate
(weighted) variables are summed, though 
in certain of the more elaborate (and as 
yet little used) forms of the regression 
equation, product terms appear. 
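
The contrast between the two techniques when only two observation points exist can be sketched as follows. This is a minimal illustration with hypothetical figures; the helper names `index_number` and `pearson_r` are introduced here for the example and are not taken from the paper. The index is computed as a ratio of weighted aggregates, and the correlation as the ordinary product-moment coefficient, returning `None` for the indeterminate (0/0) case.

```python
import math


def index_number(base, given, weights):
    """Ratio of weighted aggregates, expressed as a per cent of the base."""
    given_total = sum(v * w for v, w in zip(given, weights))
    base_total = sum(v * w for v, w in zip(base, weights))
    return 100.0 * given_total / base_total


def pearson_r(x, y):
    """Product-moment correlation; None when a series is constant (0/0)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


# Two elements, each observed at only two points (hypothetical figures).
element_a = [100.0, 120.0]   # first and second observation of element A
element_b = [50.0, 40.0]     # first and second observation of element B
weights = [2.0, 1.0]

base = [element_a[0], element_b[0]]
given = [element_a[1], element_b[1]]

two_point_index = index_number(base, given, weights)    # a usable index
two_point_r = pearson_r(element_a, element_b)           # forced to +1 or -1
constant_r = pearson_r([100.0, 100.0], element_b)       # indeterminate
print(two_point_index, two_point_r, constant_r)
```

The index already carries information about the amount of change, whereas the coefficient from two points can only report its sign.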

As index numbers are commonly 
used, they can be more safely interpret- 
ed in terms of cause and effect than can 
the majority of correlation coefficients. 
This fact does not grow directly out of 
the nature of the mathematical relations 
so much as it does out of the selection 
of variables which is usually made. In 
most applications of index numbers, ele- 
ments are included which have a known 
and demonstrable relation to the general 
category indexed. That is, for example, 
an index number of building costs will 
consist of such factors as labor and 
building materials--variables which ob- 
viously contribute rather directly to 
fluctuations in building costs. Corre- 
lation coefficients, on the other hand, 
are commonly calculated to see whether 
any relationship between one variable 
and another exists, and if a mathemati- 
cal relationship is found, the structur-
al form of this relationship still has
to be ascertained before anything can be
said with regard to cause and effect. In
this respect, index number and correla- 
tion techniques frequently proceed in 
opposite directions. 

In thus differentiating between 









20. While Hotelling has developed an ingenious extension of the usual conception of criteria, or de-
pendent variables, his method still does not make them universally available. See Harold Hotel-
ling, "The Most Predictable Criterion", Journal of Educational Psychology, XXVI (February, 1935)
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the requirements of the two techniques, 
there is no intention of implying that an 
index number is better than a multiple 
regression equation in any particular 
case. It is simply applicable to certain 
kinds of problems under a wider range of 
conditions. If an independent criterion 
of the complex (dependent) variable were 
available, one could determine by corre- 
lation whether an index number or a re- 
gression equation provided the best meth- 
od of approximating it (assuming the nec- 
essary variation). It will also be rec- 
ognized that correlation techniques have 
their own unique sphere of service, into 
which index numbers do not intrude, such 
for example as determining relative con- 
tributions. 

At this point we may turn atten-
tion to certain considerations with re-
gard to the actual preparation and use of
index numbers. It would seem that there
are six phases of the work which should 
be given attention. The first of these, 
that of selecting elements or factors to 
be included, has already been referred 
to. Assuming that the general variable 
to be indexed is complex, one's first 
thought will be to use elements which are 
definitely representative of this gener- 
al category or class. This is partly an 
abstract matter, partly an empirical one. 
Questions of logic, of interpretation, 
and of definition will be involved; also, 
one's acquaintance with the factors which 
are available for measurement will in- 
fluence his decision. It is somewhat 
common for one's judgment to be too heav-
ily influenced by this second group of
considerations, especially in attempts to 
measure status with respect to abstract 
concepts, such as "merit." It should be 
borne in mind that while multiple corre- 
lation has its criterion variable, (fre- 
quently resting very heavily upon judg- 
ment), the index number depends for its 
validity directly upon the component 


variables which are included in it, and
these are usually selected on the basis
of individual or group judgment.

There are situations in which
the selection of items to be included in
the index number is more a matter of
sampling of the category than it is a
matter of judgment as to what factors
are properly embraced by the general
concept. For example, in the field of
wholesale prices, anything sold at whole-
sale might be included, and selection is
largely a matter of sampling of consti-
tuents, as previously discussed. If a
question concerning the importance of a
particular item were raised, such a ques-
tion would be answered by pointing out
that the commodity in question was
weighted in accordance with its volume
of turnover. Where no serious question
enters as to what properly constitutes
an element in the general concept repre-
sented, the matter of judgment is not
so prominent. It is of course recog-
nized that in any field, problems of
definition will occur.

Proper weighting is a second
matter to be considered in constructing
index numbers. In the field of simple
price indexes, one may weight by the
quantity sold, and there can be little
argument.21 In indexes for more varie-
gated or more subjective characteris-
tics, such as general price level, gen-
eral business activity, or the efficien-
cy of school systems, the problem of
weighting becomes more uncertain, and
frequently it must rest largely on judg-
ment after careful analysis. In any
class (general concept) having widely
differing kinds of elements, which are
variously received or used (purchased)
by different persons, weighting can rep-
resent only a sort of average of field
conditions, and will not usually be ac-
curate for any particular case (person,
or group of persons). For example, with
reference to "cost of living", families
on different economic levels, or on the

21. A question may arise as to whether to weight by the quantities in the base or in the given year.
Certain formulas use a combination of the two. Exact weighting may not be of large practical
importance. See Fisher, op. cit. (footnote 6, page 266), pp. 528, 346-48, 452, 447-49.

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same economic level but with different 
tastes, spend very different portions of 
their budget for elements which enter 
into the cost of living index, so that 
any weighting used will probably not fit 
any large number of families.22 The same
complication may arise in other applica- 
tions of index numbers. Further, when 
reciprocals of index numbers are used, as 
for the purpose of indicating the gener-
al purchasing power of money, the rela-
tive weighting is changed, after the man-
ner that the weighting of items in the
harmonic mean differs from the weighting
of the same items in an ordinary arith-
metic mean. The matter of weighting,
therefore, presents complications which 
are not readily solved. 
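
The point that a single set of weights will not fit any large number of families can be illustrated with a small computation. The figures below are hypothetical, and the budget shares and the helper `weighted_index` are assumptions made only for this sketch: the same price ratios are combined once with average ("published") weights and once with the weights of two particular families.

```python
# Hypothetical price ratios (given year / base year) for three budget classes.
price_ratios = {"food": 1.20, "rent": 1.00, "clothing": 0.90}

# Hypothetical budget shares; each set sums to 1.
average_weights = {"food": 0.40, "rent": 0.40, "clothing": 0.20}
family_a_weights = {"food": 0.60, "rent": 0.30, "clothing": 0.10}
family_b_weights = {"food": 0.20, "rent": 0.50, "clothing": 0.30}


def weighted_index(ratios, weights):
    """Weighted arithmetic average of the ratios, expressed as a per cent."""
    total_weight = sum(weights.values())
    weighted_sum = sum(ratios[item] * weights[item] for item in ratios)
    return 100.0 * weighted_sum / total_weight


published = weighted_index(price_ratios, average_weights)
family_a = weighted_index(price_ratios, family_a_weights)
family_b = weighted_index(price_ratios, family_b_weights)
print(round(published, 2), round(family_a, 2), round(family_b, 2))
```

Under these assumed figures the published index lies between the two family indexes and describes neither of them closely.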

Reference may be made in passing 
to a prevalent misconception that when 
one uses data as collected, the data are 
not weighted. Such data have, however, 
a natural weighting which is just as real
(and may be just as wrong for a certain 
purpose) as any artificial weighting that 
may be assigned. In the case of index 
numbers, the natural weighting will arise 
from the differences in the variability 
of elements. On the other hand, some 
workers, when combining series to form a 
composite, mechanically proceed to re- 
move this natural weighting and reduce 
all of the variables in their studies to 
equal weighting, presumably with the
thought that they have thereby relieved
themselves of responsibility for judg-
ment, and have made their work perfect-
ly objective. While this is not likely
to be done in the case of the elements
of index numbers, where a formula is
usually followed, one may, in the same
frame of mind, omit values for the nomi-
nal or assigned weights of the elements,
believing that a uniform weight of unity
makes his work "objective." It should
be made explicit that equal weighting is
likely to be much less justified than
approximate, arbitrary weighting. In
psychological fields the matter of
weighting is primarily one of judgment,
barring the extensive research which
might remove judgment another step, and
one does not improve his work by failing
to exercise judgment where it is called
for.

Definition of detailed concepts 
constitutes a third important aspect of 
gathering data for index numbers. Ele-
mental classes, and measures of them,
must be uniformly defined. While obvi- 
ous, this matter is frequently not given 
appropriate attention. For example, 
such an apparently simple thing as "one 
day of attendance" varies a great deal 
in its concept from one school system to 
another. Phillips gives an illuminating 
discussion of this difficulty.23 As in
any area of critical work, the perception 





22. For an index number to be generally interpretable as having a precise significance for each of
the various situations (persons or groups on different economic levels, or situations varying in
any other factor that is related to weights) calls for the assumption that all of the weights
will vary in the same proportion from one situation to another (e.g., from one income group to
another). This obviously is not likely to occur. The condition can however be satisfied if, for
each magnitude (class interval) of price ratios, the average of weights is constant from situa-
tion to situation; or if, for each magnitude (class interval) of weights, the average of price
ratios is constant from situation to situation. That is, the correlation ratio between weights
and situations, with price ratio constant, must equal zero; or the correlation ratio between
price ratios and situations, with weights constant, must equal zero. Other conditions which are
theoretically satisfactory are that all of the weights for the various commodities should be
equal, that all of the price ratios for the various commodities should be equal, or that the
weights for the various commodities should not vary from situation to situation. Perhaps other
conditions which will be satisfactory can be found. These statements refer of course to the
true weights in the field; there are as many weights for each commodity as there are situations.
The statements are made in terms of economic concepts because these are more readily followed.
Application to other fields may be readily made.

23. Frank M. Phillips, "Educational Rank of the States, 1930", Section II, "Uniform Definitions, Rec-
ords, and Reports", American School Board Journal, 84 (March and April, 1932). Also in a pamph-
let of same title, published by the author, Washington, D. C., 1932, pp. 25-40.

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of difficulties, (when not exaggerated), 
is in itself an indication of some skill 
and maturity, and common units and 
traits (variables) are frequently not 
well defined principally because workers 
have not developed a sufficient familiar-
ity with the field in which they are con- 
ducting research. 

A fourth matter for attention is 
the sampling of the source-field. Per- 
haps this has been discussed sufficient- 
ly in connection with the general nature 
of index numbers. It is similar to the 
problem of sampling in any research work. 
It may be that in some instances complete 
field data can be secured. In any large 
study, however, this is impossible. In 
the case of prices, one cannot gather da- 
ta on all of the prices of any single 
commodity in every city and village in 
the United States, so one resorts to 
sampling of the field, and gathers data 
from a certain number of cities which he 
believes will also represent other cit-
ies.24 King25 contends that the problem of
sampling (referring probably to sampling 
both of the constituents of the general 
variable and of the source-field) is the 
only real problem of index numbers. While 
other writers recognize the importance of 
sampling they do not concur in such an 
extreme emphasis.26

Fifth, we have the form of index 
number to be used. This form will in
part control, or be controlled by, the
weighting to be given the various ele-
ments. It will also determine cer-
tain other characteristics of the re-
sulting values. Fisher27 gives the most
extended treatment of formulas, though 
he does not exhaust the possibilities. 
He discusses six types of averages and 
six types of weighting (p. 351), and ana- 
lyzes the resulting index numbers on the 
basis of several criteria. He concludes 
that his formula No. 353 is the "ideal" 
one (see his pp. 360 and 493), but that 
formula No. 2153 is more easily calcu- 
lated, and is practically as good 
(pp. 361 and 494). His formula No. 53 
(estimated to be correct within one per 
cent, pp. 362 and 494) is both rapid and 
simple to explain to non-technical work- 
ers. It is the form used by the U. S. 
Bureau of Labor Statistics in calculat- 
ing the index numbers of wholesale prices, 
retail prices, and cost of living. It 
will be found satisfactory for most pur- 
poses. His formula No. 1, which is a 
simple average, is the one that has been
generally used in educational studies; 
Fisher says of it that "It should not be 
used under any circumstances, being al- 
ways biased and usually freakish as 
well."28
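
The four formulas mentioned above can be compared on a small example. The sketch below rests on an assumption not spelled out in the text: it takes Fisher's No. 1 to be the simple arithmetic average of ratios, No. 53 the aggregative form weighted by base-year quantities, No. 54 the same form weighted by given-year quantities, No. 353 the geometric mean of Nos. 53 and 54, and No. 2153 the aggregative form weighted by the sum of base-year and given-year quantities. The data are hypothetical.

```python
import math

# Hypothetical prices and quantities for three commodities.
p0 = [2.0, 5.0, 10.0]   # base-year prices
p1 = [3.0, 5.5, 9.0]    # given-year prices
q0 = [10.0, 4.0, 1.0]   # base-year quantities
q1 = [8.0, 5.0, 2.0]    # given-year quantities


def aggregate(prices, quantities):
    """Weighted summation of prices times quantities."""
    return sum(p * q for p, q in zip(prices, quantities))


def formula_1(p0, p1):
    """Simple (unweighted) arithmetic average of the price ratios."""
    return 100.0 * sum(a / b for a, b in zip(p1, p0)) / len(p0)


def formula_53(p0, p1, q0):
    """Aggregative index weighted by base-year quantities."""
    return 100.0 * aggregate(p1, q0) / aggregate(p0, q0)


def formula_54(p0, p1, q1):
    """Aggregative index weighted by given-year quantities."""
    return 100.0 * aggregate(p1, q1) / aggregate(p0, q1)


def formula_353(p0, p1, q0, q1):
    """Geometric mean of the base-weighted and given-weighted aggregatives."""
    return math.sqrt(formula_53(p0, p1, q0) * formula_54(p0, p1, q1))


def formula_2153(p0, p1, q0, q1):
    """Aggregative index weighted by the sum of the two years' quantities."""
    combined = [a + b for a, b in zip(q0, q1)]
    return 100.0 * aggregate(p1, combined) / aggregate(p0, combined)


results = {
    "No. 1": formula_1(p0, p1),
    "No. 53": formula_53(p0, p1, q0),
    "No. 353": formula_353(p0, p1, q0, q1),
    "No. 2153": formula_2153(p0, p1, q0, q1),
}
for name, value in results.items():
    print(name, round(value, 2))
```

With these assumed figures Nos. 353 and 2153 differ by well under one index point, in keeping with the statement that the latter is practically as good and more easily calculated.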

24. A description of the methods by which the United States Bureau of Labor Statistics gathers data
from the field for its index numbers is given in the following bulletin: Methods of Procuring
and Computing Statistical Information of the Bureau of Labor Statistics. Bulletin of the U. S.
Bureau of Labor Statistics, No. 526. Washington, D. C., March, 1925. P. 54.
These methods have recently been modified, as described in: Faith M. Williams, Margaret H. Hogg,
and Evan Clague, "Revision of Index of Cost of Goods Purchased by Wage Earners and Lower-Salaried
Workers", Monthly Labor Review, XLI (September, 1935), 819-37.

25. King, op. cit. (footnote 7, page 266), p. 49. His position is further developed in Chapter IV,
"Sampling as Related to Index Numbers", pp. 59-77, and Chapter VII, "Percentages of Error Found
in Certain Price Indices", pp. 143-88.

26. See, for example, Fisher, op. cit. (footnote 6, page 266), pp. 556-40, and 524-25.

27. Fisher, op. cit. (footnote 6, page 266).

28. Ibid., pp. 361, 466; see also pp. 64-6.

29. Truman L. Kelley, Statistical Method, Chapter XIII, "Index Numbers", pp. 331-47. New York: The
Macmillan Co., 1923. pp. 590. On pp. 344-45, Kelley's formulas Nos. 15, 10, 12 are respective-
ly Fisher's formulas Nos. 353, 2153, and 53.

30. Allyn A. Young, "Index Numbers", Chapter XII, pp. 161-194, in Handbook of Mathematical Statistics,
H. L. Rietz, editor. Boston: Houghton Mifflin Co., 1924. P. 221. Young's formulas Nos. 11,
15, and 10 are respectively Fisher's formulas Nos. 353, 2153, and 53.

Other convenient sources of index
number formulas are Kelley,29 Young,30 and
most books on statistical methods in
economics. Frisch's review includes
many of them. A very practical and com-
prehensive treatment is given by Croxton
and Cowden.31 One should also consult is-
sues of the Journal of the American Sta-
tistical Association for a number of
years back, for many practical and theo- 
retical articles. Most of the useful 
formulas do not present mathematical dif- 
ficulties, though they may appear to do 
so to the tyro. They are generally stat- 
ed in terms which grow out of the econom- 
ic field; e.g., p stands for the price of 
a commodity, and q stands for the quan- 
tity of this item that was sold. To 
translate these symbols into non-finan- 
cial terms, p would stand for the value 
observed for any particular elemental 
variable at any particular time, place, 
or circumstance, and q would be the 
weight assigned to this element. 

Sixth, and finally, we should 

the matter of interpretation. 
always a critical step in re- 
search, and particularly so when a4 mathe- 
matical formula of some complexity has 
been employed. The more refined the 
mathematical reasoning by which the form- 
ula has been derived, the more careful 
one must be in applying and interpreting 
it, for many assumptions are likely to 
have been made, either expressly or in- 
pliedly--mostly the latter. Upon careful 
examination, the interpretation of index 
numbers presents more difficulties than 
are at first apparent. The difficulties 
lie in part in the weighting, and in part 
in the philosophy of value. 

The chief source of the difficul- 
ty, so far as weighting is concerned, 
arises from the fact that there is actual- 
ly a fourth conditioning variable operat- 
ing which is generally omitted from the 
calculation, but which cannot logically 
be overlooked in the interpretation. Thus, 
to illustrate in terms of an economic in- 
dex number, we commonly recognize the 
three variables time (or place), price, 
and quantity (weight). The fourth vari- 





mention 
This is 













able is the individual buying habits of 
persons, families, or groups, which ex- 
hibit varying patterns of weights for 
the different elements or commodities, 


and thus make a single or constant 
tern of weights unrepresentative of 
their particular situation. Accordingly, 
one could not select a city in which to 
live by using a cost of living index, 
unless his spending habits conformed 
closely to the weights used in the index 
number. One stands the chance of his 
weighted index number representing no 
actual group or situation at all--which, 
of course, is true of nearly all aver- 
ages, but particularly of means. 

To illustrate the weighting dif- 
ficulty in terms of appraisal, we may 
assume a hypothetical case of a prospec- 
tive student using index number ratings 
on various colleges as a basis for his 
selection. He is interested, let us as- 
sume, in a school in which a broad pro- 
gram of extra-curricular activities is 
emphasized. Index numbers which use an 
average (constant) set of weights for 
all colleges in the country would scarcely
afford a satisfactory comparison for
his purpose. They would reflect no 
higher rating for an institution giving 
a great deal of attention in its educa- 
tional program to extra-curricular ac- 
tivities than for another institution 
which gave the same quality of work but 
allowed only for a very small amount of 
the student's time in this area, because 
the quality in both cases (which we as- 
sumed to be constant) would be weighted 
by the same weight. 

The second difficulty, as stated, 
lies in the philosophy of value. Per- 
haps "value received" represents a fifth 
variable--and one which also is omitted 
from the calculation. Thus, in the 
field of economics, as prices vary in 
different ways for different commodities, 
the quantities purchased change. A person may purchase X units of
commodity A, and Y units of commodity B when these are at a certain
price; but if the price of commodity A increases more rapidly than that
of B, one may purchase fewer units of A, and more units of B than
before. Assuming that his total expenditures have not been increased,
his total satisfaction may be greater than it would be if he continued
to purchase the original quantities of commodities A and B. In other
words, as prices change,
quantities of different items purchased 
also shift, in order that one may get the 
maximum amount of satisfaction from his 
spendable income. It is thus possible, 
under changing conditions, for one to re- 
ceive as great satisfaction as formerly 
with an expenditure that has not changed 
as much (or as little) as a mechanical 
average of price ratios, in their origi- 
nal quantities, would call for. Much de- 
pends upon the flexibility of the indi- 
vidual's scheme of values, and the com- 
pensations that may be made. The analo- 
gous application to values in appraisal 
should be clear; a deficiency in one as- 
pect may be more than compensated for by 
some special perfection in another as- 
pect, giving a joint result, or pattern, 
which is superior to the sum of the in- 
dividual ratings of the two aspects, 
weighted with established weights. Also, as in the case of economics,
the flexibility or possibility of such compensations will depend
somewhat upon the individual who is concerned (the user). A fixed
scheme of arriving at an index number value may prevent it from
reflecting such compensations, and hence from representing accurately
the variations in true value which exist.
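
The effect of substitution on a fixed-weight index may be sketched
numerically. The example below is an illustration only: the prices and
quantities are hypothetical, and it is simply assumed that the second
set of purchases yields the buyer as much satisfaction as the first. The
fixed-basket figure is the form commonly called a Laspeyres index, a
name the text itself does not use.

```python
def cost(prices, quantities):
    """Total expenditure for the given quantities at the given prices."""
    return sum(prices[k] * quantities[k] for k in quantities)

# Hypothetical prices of commodities A and B in two periods.
base_prices = {"A": 1.00, "B": 1.00}
new_prices = {"A": 1.50, "B": 1.10}   # A rises faster than B

# Hypothetical purchases: the original quantities, and the quantities
# bought after the buyer shifts part of his spending from A to B.
base_qty = {"A": 10, "B": 10}
new_qty = {"A": 6, "B": 15}

base_cost = cost(base_prices, base_qty)

# Mechanical average: original quantities valued at the new prices.
fixed_basket_index = 100 * cost(new_prices, base_qty) / base_cost

# Actual change in expenditure once substitution is allowed for.
actual_spending_index = 100 * cost(new_prices, new_qty) / base_cost

print(f"fixed-basket index:    {fixed_basket_index:.1f}")
print(f"actual spending index: {actual_spending_index:.1f}")
```

Under these assumptions the fixed-basket index stands at 130.0 while
actual expenditure has risen only to 127.5, which is the kind of
discrepancy the paragraph above describes.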

Such problems of interpretation 
are not important where the general con- 
cept or class being indexed is simple, 
or relatively homogeneous. Here the in- 
dex number may be interpreted as repre- 
senting percentage variation, and as ap- 
plying generally. Even when the con- 
cept or class is heterogeneous, one can 
make the same interpretation if he is 
willing to do so "on the average", without
interpreting his result as being applicable
to any particular or single case. To withhold
inferences from individual cases is difficult,
unless one
thinks strictly in terms of totals-- 
such, for example, as the general (aver- 
age) level of education in different 
states, as wholes, and without any sig- 
nificance for individual cities in the 
states. Even here the problem of sub- 
limation of value through substitutions 
is present. Such considerations lead 
Leontief33 to refer to index numbers in
general as "statistical approximations 
to a theoretically indeterminate con- 
cept." 

In order to illustrate more concretely
the applicability of index numbers
to educational problems, examples
of their use will be mentioned. We may 
begin by referring to those which deal 
with economic aspects of education. In this field, H. F. Clark, and
certain associates, prepared three series of price indexes, covering
respectively the cost of school supplies,34 the price (interest rate)
for school bonds,35 and the cost of school buildings.36 The latter field
has also been worked on by others,37 the most outstanding study being
that by Burgess.38 An index number for the cost of equipping new
buildings was prepared by Loomis.39 Davis employed a sort of index
number to study the increasing cost of operating school buildings.40

Several cost-of-living index numbers have been constructed especially
for teachers,41 to reflect variations in costs from year to year or from
place to place. The report of the Committee on the Economic Status of
the Teacher42 reviews four earlier index numbers (those of McKay and
Warne, Boothe, Eells, and Butsch), and then constructs a new one, for
the years 1928-34. Harry43 made a study of the cost of living of
teachers in different parts of New York. This study was repeated in
Ohio, and the index number was included as one of the factors
recommended for use in the distribution of state school money.44

In addition to such special index numbers, use has been made of the
general economic index numbers.45 For example, index numbers of the cost
of living have been used to show variations in the purchasing power of
teachers' salaries.46 They have also been used rather widely to account
in part for the increasing costs of education since 1900; in fact,
almost half of the increase in expenditures for education from 1900 to
1930 has been attributed to the decreased purchasing power of the dollar
during this period.47 Another type of comparison that is possible with
economic index numbers is that between the increase in expenditures for
education and the increases in industrial production and business
activity, as shown by certain index numbers.

33. Leontief, op. cit. (footnote 5, page 266), p. 45.
34. Harold F. Clark and John Guy Fowlkes, "Index Numbers for School
Supply Prices." Appeared monthly in the Nation's Schools from September,
1928 (Vol. II) to December, 1929 (Vol. IV) and was then combined with
the index of school building costs, being discontinued with the March,
1930 issue (Vol. V).
35. Harold F. Clark, "Index of School-Bond Prices." Appeared monthly in
the American School Board Journal from January, 1928 (Vol. 76) to
November, 1931 (Vol. 83).
36. Harold F. Clark. This series began as "School Building Cost Index"
in American Education Digest, 48 (December, 1928), 28. Continued as
"School Building Index" in School Executives Magazine from January to
August, 1929 (Vol. 48); continued with Oscar K. Buros, joint author, as
"Index of School Building Prices", in the same magazine, September to
December, 1929 (Vol. 49); then combined with the index of school supply
prices in the Nation's Schools from January to March, 1930 (Vol. V).
can School Board Journal, 69 (October, 1924), 64; 71 (September, 1924), 65; and 85 (August,
38. Randolph W. Burgess, Trends of School Costs. New York: Russell Sage
Foundation, 1920. P. 142. Chapter V traces building costs from 1841 to
1920, in terms of index numbers. Also given in part in the following:
"Eighty-year Fluctuations in the Cost of American School Buildings",
American School Board Journal, 62 (January, 1921), 57-8; also in
Proceedings
(July, 1926), 53.
(February 8, 1933), 42-45.
42. "The Teacher's Economic Position", Research Bulletin of the National
Education Association, XIII (September, 1935), Chapter VII, pp. 222-42.
See also Circular No. 1, January, 1933, of the Educational Research
Service of the National Education Association, on "Estimating Changes in
Teachers' Cost of Living."
43. David P. Harry, Cost of Living of Teachers in the State of New York.
Contribution to Education, No. 520. New York: Teachers College, Columbia
University, 1928. P. 184.
44. Equalizing Educational Opportunity in Ohio: a preliminary report of
a survey of state and local support of public schools in Ohio, prepared
under the direction of Paul R. Mort. The Ohio School Survey Commission,
Columbus, Ohio, November 1, 1932. See pp. 39-40 and 147-49.
45. For a description of various economic index numbers which are
available, and their sources, see Barr, Good, and Scates, op. cit.
(footnote 1, on page 265), pp. 445-48.

A second use that has been made 
of index numbers in education has been 
for appraisal. This group of index num- 
bers is largely, though not in all cases 
entirely, divorced from direct connec- 
tion with money. Most ambitious of these 
applications have been the attempts to 
rate the educational activities of the 
various states. Reference has already 
been made to the index number prepared by 
Ayres, which has been widely discussed 
among school men. Phillips extended 
Ayres' index and modified it. Schrammel
prepared a third index number for the
states. In addition to these state index
numbers, there have been at least
three others prepared for cities and four 
for counties. This whole field has been 
excellently reviewed and summarized by 
the Research Division of the National Ed- 
ucation Association, and a bibliography 
of 47 references compiled. The same pub- 
lication contains data for the states on 
five factors selected by the Research Di- 
vision, which, however, refrains from 
combining these into a single index be- 
cause of not desiring to decide on the 
relative weighting of each factor. An index
number for higher education in the
various states, based on eight factors,
has been prepared by Chamberlain and
Meece. Private and public education are
considered both separately and jointly.

Other studies may be found by 
consulting the topics, "Index Numbers",
and "Cost and Standard of Living", in 
the Education Index. Lundberg lists 
three index numbers that have been pre- 
pared in the field of Sociology to re- 
flect quantitative changes in condi- 
tions. 

Perhaps it is appropriate in 
closing this paper to refer to another 
use of index numbers which, so far as 
the writer is aware, has not yet been 
made. This use is for the purpose of 
combining detailed judgments, or ratings, 
such as are made when a score card is 
employed. Score cards represent an 
elaborate form of rating scale in which 
(typically) a large number of aspects of 
an object to be rated are listed for 
separate attention, each being allotted 
a maximum number of points which may be 
granted. These points, as awarded in 
rating, are added to arrive at a "score" 
for any object. The score is interpreted
either through comparison with the
score for other objects rated in the
same study, or by comparison with certain
standard values. It would be readily
possible to modify this technique
slightly so that an index number would 
result. This would require expressing 
ratings for each aspect or element as 





48. Ayres, op. cit. (footnote 10, page 267).
Frank M. Phillips, "Educational Rank of the States, 1930", American
School Board Journal, Vol. 84, February through May. (Also available as
a forty-page reprint, from the author, Washington, D. C.) Earlier
articles: "Educational Rank of the States, 1924", American School Board
Journal, 72 (April, 1926), 47, 141; and "Educational Ranking of the
States by Two Methods", American School Board Journal, 69 (December,
1924), 47-49. This early series is available as a thirty-two-page
publication for the Bruce Publishing Co., Milwaukee, Wisconsin (1925).
Henry E. Schrammel, The Organization of State Departments of Education.
"Ranking of States According to Educational Achievements", Chapter IX,
pp. 115-34. Bureau of Educational Research Monographs, No. 6, Columbus,
Ohio: Ohio State University Press, 1926.
"Estimating State School Efficiency", Research Bulletin of the National
Education Association, X (May, 1932), No. 3, 104-112. See also, for some
additional references: Frank L. Shaw, State School Reports, pp. 103-107.
Contributions to Education, 242. New York: Teachers College, Columbia
University, 1926, P. 142.
Leo M. Chamberlain and L. E. Meece, State Performance in Higher
Education, Bulletin of the Bureau of School Service, V (March, 1933),
No. 3. P. 57. Lexington, Ky: The University of Kentucky.
George A. Lundberg, Social Research. New York: Longmans, Green and Co.,
1929. P. 380. See Appendix C, p. 362.
percentage variations from a standard condition or quality, in lieu of
granting so many points. If thought desirable, limits could be assigned
to the maximum variation to be used for each element, such limits
having the effect of weights. Again, if desirable, certain factor
weights could be assigned to the series of elements, to be effective in
combining (averaging) the per cents given to these elements when rated.

The result would be an index number representing a weighted composite
of judgments on detailed aspects of the object, and varying from 100%
as normal.
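
The proposed procedure can be sketched in a few lines of Python. This is
a hedged illustration and not a worked-out instrument: the function
name, the three elements rated, their per cent ratings, the factor
weights, and the limits are all hypothetical.

```python
def score_card_index(ratings, weights, limits=None):
    """Weighted mean of per cent ratings, 100 being the standard condition.

    ratings: per cent rating given to each element.
    weights: factor weight assigned to each element.
    limits:  optional (low, high) bounds on the rating used for an element.
    """
    total = 0.0
    for element, rating in ratings.items():
        if limits and element in limits:
            low, high = limits[element]
            rating = max(low, min(high, rating))  # restrict extreme ratings
        total += weights[element] * rating
    return total / sum(weights[element] for element in ratings)

# Hypothetical ratings of one school building on three elements.
ratings = {"lighting": 120.0, "heating": 90.0, "seating": 160.0}
weights = {"lighting": 2.0, "heating": 3.0, "seating": 1.0}
limits = {"seating": (50.0, 130.0)}

unlimited = score_card_index(ratings, weights)
limited = score_card_index(ratings, weights, limits)
print(f"without limits: {unlimited:.1f}   with limits: {limited:.1f}")
```

Here the unrestricted index is about 111.7, and restricting the seating
rating to 130 lowers it to about 106.7, showing how a limit acts in the
manner of a weight.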

It would have certain advantages over the 
common practice in preparing and using 
score cards. If the per cents were not 
restricted, the new procedure would be 
much more flexible than the old, allow- 
ing the rater a wider range of "scores" 
on each element, thus providing more ade- 
quately for extreme variations. These 
could later be "toned down" or amplified,
as appropriate, by the final weights
assigned to each element. This plan would
permit rating above and below normal, in 
place of always rating down from an ideal 
standard. The use of per cent permits a 
common unit of expression for all of the 
items, instead of the variable scale pre- 
sented in the typical score card by dif- 
ferent numbers of points allowed as max- 
ima for the various items. The per 
cents, being uniform, might result in 
somewhat greater accuracy in expressing 
the rater's judgment. In both cases, 
printed suggestions as to standard condi- 
tions and as to allowances for various 
other described conditions, are in order. 
A disadvantage of the suggested technique
is an increased amount of calculation,
requiring multiplying if weights are 
used, and in any case, requiring the ad- 
dition of larger numbers. It would be 
subject to the difficulties of interpre- 
tation previously discussed, in connec- 
tion with value, but in this respect it 
does not suffer more than the ordinary 
score card; the difference is that score 
cards have not been subjected to as crit- 
ical scrutiny. 

In summary, the index number 
technique affords a well developed, care- 
fully examined, and versatile procedure 
for combining weighted elements into a 
composite variable. While it owes its 
origin to the field of Economics, it is 
perfectly general in its application, 
being less restricted in a number of 
ways than is multiple regression. It has 
been used in Education to reflect varia- 
tions in the cost of supplies and build- 
ings, and to indicate different levels 
of merit. Certain points that should re- 
ceive particular attention in the use of 
index numbers are: the choice of elements,
the weighting of these elements,
their specific definition and units of 
measurement, the sampling of the source- 
field, the form of the index number used, 
and the interpretation of the results. 
There are other applications of index 
numbers that can be made, and it would 
appear that the technique should receive 
a wider recognition and use than has 
been given it in the past.

One of the great handicaps to the scientific study of teaching is the
complexity of the activity, the numerous conditioning factors which
need to be kept in mind, and the extreme rapidity with which the action
takes place. The complexity of the activity and the rapidity with which
it changes are such that it is physically impossible to observe
accurately the things that happen in teaching and make a reliable
evaluation of them. It has been apparent for some time
that some means of obtaining more com- 
plete records of the teaching act was es- 
sential to more accurate studies of teach- 
ing. The more complete and objective the 
record the more significant the analysis 
that can be made from it. 

Various data-gathering devices 
have been used at various times in at- 
tempts to build up adequate records of 
the classroom activities. Among the earliest of these was a running
account of the class period jotted down in rough abbreviated form as
the class work proceeded. While this was far superior to mere memory of
what had happened, it had many serious shortcomings. In the first
place, many important factors can not be recorded by this method due to
the slowness of writing and to the fact that the observer can
concentrate upon only one or two aspects of the lesson at a time.
Second, since the activities recorded represent but a small fraction of
the total class activities, the record is always incomplete and there
is a question as to whether the facts recorded are of greatest
significance. In the third place, the data recorded from observation to
observation vary so widely in content and form that it is impossible to
compare one class exercise with another. Fourth, the records are almost
always evaluative, presenting inferences rather than facts (the
ordinary record is an interpretation of what the observer sees and not
an objective record of what happens). Finally, the records are made in
terms of verbal symbols, and verbal descriptions, no matter how good,
are highly personal and subjective.

The development of the non-evaluative activity check list, which has in
a way superseded the running account method, made possible a more
complete, objective, and comparable record. Such check lists provide a
method of recording the important happenings of the class period
without any attempt to evaluate them. Although the check list doubtless
represents progress in the difficult task of studying the teacher at
work, it too offers a very insufficient record of the happenings of the
classroom. While check lists may be made more objective and reliable,
they may be quite incomplete, many details of setting, gestural and
even verbal expression being lost in such reports.

A third method employed in re- 
cording the happenings of a classroom 
is the stenographic report. This 





1. A. S. Barr, An Introduction to the Scientific Study of Classroom
Supervision (New York: D. Appleton and Company, 1931), pp. 190-234.

Figure 1. Sound Recording Equipment Assembled for Transportation.

2. For a discussion of the use of stenographic records see Maxie N.
Woodring, "The Use of the Stenographic Lesson in Improving
Instruction", Teachers College Record, XXXVII, No. 6, p. 504.

recitation it would consist of a visual-auditory report such as one
might secure only from a sound motion picture.

As one looks ahead, one sees that the desirable record of the future is
to be that of the sound motion picture. The rapid technological
development in this field indicates that even now such records are
entirely possible for those that have the time, money and equipment to
make them. Such
records will place before the student of 
education, in permanent form the impor- 
tant facts of teaching. They can be re- 
produced as often as desired, to be stud- 
ied and analyzed one factor at a time. 
It would be hard to overemphasize the im- 
portance of such records in the scientif- 
ic study of teaching. 

As a first step in the produc- 
tion of such records of teaching, exper- 
imental work has been carried on now for 
some time at the University of Wisconsin 
in the field of sound recording. This 
experimentation appears to have reached 
a stage where satisfactory sound records 
can be made of ordinary class work. The 
equipment used in the making of such
records and some of the problems involved
are described in this article.

In the first place it should prob- 
ably be pointed out that there are numer- 
ous methods of producing sound records. 
Of these many methods it seems that only 
two may be practically used under pres- 
ent conditions for classroom recording. 
These methods are the photographic sound on film record, and
instantaneous recording on disc. Each method has its advantages and
disadvantages. One of the very best means of recording sound is sound
on film. It is possible by this means to secure recordings of very high
quality. It is also possible to make long uninterrupted recordings up
to over an hour in length, which is a decided advantage in classroom
recording. This system is not, however, without its disadvantages.
First, the film must be developed before it can be played back.
Immediate reproduction is impossible. Second, recording on film is a
complicated process and too technical for the ordinary lay worker.
Third, the film records are expensive, delicate, and difficult to
handle. Processes are now being developed, however, that may overcome
certain of these difficulties.

At present, while instantaneous sound recording on disc is not without
certain limitations, it appears to have certain advantages over sound
on film recording for ordinary classroom use. First, there is no
intermediate process between the recording and the reproduction. The
record may be played back immediately. Second, the method is simple
enough so that good results may be obtained without the services of a
trained technician. Third, the equipment is less expensive than that
required for other methods of recording. Its chief disadvantages are
that there is a certain amount of needle scratch which sound on film
avoids and that the amount of recording which can be placed on one side
of a record is very limited as compared to the amount on sound film.
One side of a twelve-inch record at the standard turntable speed of 78
R.P.M. will make a recording of about five minutes. By changing to the
33 1/3 R.P.M. speed the amount of recording is approximately doubled
but the quality at the reduced speed is not so high. Whether
true at the present time or not, it was
felt at the time experimentation was 
started at the University of Wisconsin, 
that the instantaneous recording on disc 
method offered the greatest possibili- 
ties in the field of classroom recording. 
Equipment of this type was therefore 
purchased. The various units of this 
equipment are shown in Figure 2. 
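
The practical meaning of the playing times quoted above may be seen from
a short calculation. The sketch below takes the figure of about five
minutes per twelve-inch side at 78 R.P.M. and the statement that the
amount is approximately doubled at 33 1/3 R.P.M. directly from the text;
the fifty-minute class period is a hypothetical value assumed only for
illustration.

```python
import math

# From the text: about five minutes per twelve-inch side at 78 R.P.M.,
# approximately doubled at 33 1/3 R.P.M.
MINUTES_PER_SIDE_78 = 5.0
MINUTES_PER_SIDE_33 = 2 * MINUTES_PER_SIDE_78

def sides_needed(class_minutes, minutes_per_side):
    """Number of record sides required to cover a class period."""
    return math.ceil(class_minutes / minutes_per_side)

class_minutes = 50  # hypothetical length of a class period

print("sides at 78 R.P.M.:    ", sides_needed(class_minutes, MINUTES_PER_SIDE_78))
print("sides at 33 1/3 R.P.M.:", sides_needed(class_minutes, MINUTES_PER_SIDE_33))
```

On these assumptions a single class period requires ten sides at the
standard speed and five at the reduced speed, as against one
uninterrupted recording on sound film.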





3. Prof. M. L. Hanley, University of Wisconsin, in Materials for Research, to be issued by the Joint
Committee on Materials for Research of the Social Science Research Council and American Council
ence, to be published by the Cambridge University Press.
Karl Windesheim, Practical Apparatus for Sound Recording in the Speech Classroom,
Laboratory, and Clinic (Ph. D. Thesis, University of Wisconsin, 1934), pp. 100-119.

Figure 2. The Units of Equipment Employed in Making Sound Records.


mechanically inscribes the sound waves 
into a suitable surface. 
5. A record blank, of
for receiving and retaining the
undulations cut by the stylus, and also
suitable for actuating a reproducing
stylus. Aluminum blanks have been found
quite satisfactory.
6. A turntable
the record blank
speed. For continuous
ment with
advantages.
7. A phono pick-up which furnishes a
means of transforming the sound
forms cut upon the record back again
to their equivalent electrical energy.
8. Audio-frequency amplifier,

Figure 3. Wiring Diagram of Sound Recording Instrument.

to make certain that it will reach a convenient
room where the rest of the equipment
can be set up.

In recording, the recording unit 
and the amplifier are set up side by side. 
The operator, by means of earphones lis- 
tens to what is happening in the class- 
room. By listening with the earphones 
and by watching the needle of the volume 
indicator the operator can soon learn to 
operate his controls so as to get the 
best recording possible. 

Experiments in actual classroom situations show that the biggest
problem involved is that of overcoming acoustical distortion. When one
contrasts the ordinary classroom, with its hard floors, bare walls and
hard finished ceiling, with the sound recording studio, with its
various sound insulating devices to give the exact acoustical
properties needed, it is not surprising that this is true. When one
also considers that in studios, with the best acoustical conditions,
the speakers are generally within a few feet of the microphone while in
classroom recording the pupils are seated in all parts of the room, it
is easy to understand that there are many difficulties to be overcome.

One source of this acoustical distortion is the reverberation of the
room or of objects in the room. Distortion is produced because the
different frequency components in the sound may experience unequal
absorption by the room surfaces. In addition the room surfaces, or
certain of them, may diffract or interfere in varying amounts with the
different frequencies, thus causing distortion. The acoustic conditions
may be such that some frequencies will be excessively damped while
others will be overemphasized; there may be reverberation, overlapping,
echoes and other extraneous noise.

In addition to this type of distortion from reverberation there is what
is known as the "hangover" effect due to the persistence of sound after
its source has been silenced. This phenomenon has a tendency to blur
the sound, to make the beginnings and endings of speech sounds
different and to cause vowel sounds to mask succeeding consonants.7

Experience has shown that, other factors remaining constant, the
greater the distance of the source of sound from the microphone the
greater the distortion. What can be done in classroom recording is
to use several microphones so placed in 
the room as to bring every child to 
within six or eight feet of one of them. 
The audio-frequency amplifier must pro- 
vide a channel and fader for each micro- 
phone used. Many of the acoustical dif- 
ficulties can in a measure be overcome 
by the proper placing of the microphones 
in relation to the sources of sound and 
the acoustical conditions of the room. 
Sound absorbing materials such as rugs, 
bulletin boards, drapes, etc., help to 
reduce acoustical distortion. Open win- 
dows will also help very materially in 
this respect. 
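
Whether a given arrangement of microphones brings every pupil within
the stated distance can be checked by simple geometry. The following
Python sketch is offered only as an illustration: the eight-foot limit
is the upper figure mentioned above, while the seating plan, the spacing
of the seats, and the microphone positions are hypothetical.

```python
import math

def uncovered_seats(seats, microphones, max_distance_ft=8.0):
    """Return the seats lying farther than max_distance_ft from every microphone."""
    return [
        seat for seat in seats
        if min(math.dist(seat, mic) for mic in microphones) > max_distance_ft
    ]

# Hypothetical classroom: five rows of six seats, spaced three feet apart.
seats = [(3.0 * col, 3.0 * row) for row in range(5) for col in range(6)]

# Hypothetical placements (coordinates in feet).
one_microphone = [(7.5, 6.0)]
two_microphones = [(3.0, 6.0), (12.0, 6.0)]

print("seats out of range, one microphone: ", len(uncovered_seats(seats, one_microphone)))
print("seats out of range, two microphones:", len(uncovered_seats(seats, two_microphones)))
```

In this assumed room a single microphone at the center leaves eight
seats beyond eight feet, whereas two microphones bring every seat within
range.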

In concluding this discussion 
attention should be called again to some
of the uses of sound recording instruments
in education.8

1. Sound-recording instruments 
may be employed in making objective rec- 
ords of classroom instruction for pur- 
poses of research. 

2. Sound-recording instruments 
may be employed in the improvement of 
teachers in service. By the use of 
such an instrument teachers are enabled 
to listen to records of their own teach- 
ing and thus get a better picture of 
their teaching activities. These rec- 
ords can also be used in the analysis 
and improvement of teaching. 

3. Sound-recording instruments 
may be used in the institutional train- 
ing of teachers. Libraries of records 
may be developed to illustrate various 
types of classroom procedure. Such 





7. Alexander Wood, Sound Waves and Their Uses (London: Blackie and Son
Limited, 1930), pp. 112-125.
8. For a discussion of this topic see Karl Windesheim, op. cit.,
pp. 230-248.

their own difficulties in speech, oral reading, music, etc.

6. Sound-recording instruments may be used to make and give
standardized achievement tests. With sound tests recorded on disc such
factors as test directions, time allowed for each item of the test,
etc., can be made practically uniform.








A STUDY OF LATERALITY TEST ITEMS
by
Catharine J. Hull
Speech Clinic, University of Minnesota
Minneapolis


In the last few years, laterality has assumed a significant role both
in the experimental laboratory and in clinical remedial work. Not only
has it been related to observable peripheral activity, but its
confusion has been considered symptomatic of certain dysfunctions of
the central nervous system. Moreover, recent work in aphasia has
demonstrated that speech has a unilateral localization in the cerebral
cortex, and that this localization is correlated directly with hand
preference.1

Since unilateral motor activities have been widely accepted as
peripheral manifestations of this cerebral one-sidedness,2 it is
desirable to obtain a test of the side preference in these one-sided
activities. As it is usually impossible to obtain an actual performance
of each one of these activities, the best solution of the problem is to
secure the information from a questionnaire. If this obtained
information is to be used as a basis of research and clinical
recommendations, it is imperative that it be obtained from a
questionnaire which is significantly valid and reliable. Does the
subject actually perform the unilateral activities as he indicates on
the questionnaire? Can his written answer be accepted as a reliable
one? This study was initiated in an attempt to answer those questions,
and to discover which items might warrant inclusion in such a
questionnaire.


Procedure

Two questionnaires and two performance tests were given. The
questionnaire was composed of 40 items. Some of the items had been used
in previous unstandardized questionnaires, and others related to
ordinary one-sided activities were included. An attempt was made to
include only those activities with which the average person was
familiar, so speculation could be reduced to a minimum. In the second
questionnaire, the identical items were included, but were arranged in
an entirely different order. Simple directions were printed at the
top of the list of questions, and were
called to the attention of those taking 
the test. A minimum of four weeks was 
allowed to elapse between the adminis- 
tration of the first questionnaire and 
the first performance test. The same 
period of time was allowed before the 
next succeeding test. A copy of the 
questionnaire is given on the following 
page. 
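
The two questions raised in the introduction, whether a written answer
is repeated on a second questionnaire and whether it agrees with actual
performance, can each be summarized for a single item by a percentage of
agreement. The Python sketch below is one simple way of doing so and is
not stated in the text to be the method used in this study; the function
name and the responses of the five subjects are hypothetical.

```python
def percent_agreement(first, second):
    """Percentage of subjects whose two responses (R, L, or E) are identical."""
    if len(first) != len(second) or not first:
        raise ValueError("response lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)

# Hypothetical responses of five subjects to one item.
questionnaire_1 = ["R", "R", "L", "E", "R"]
questionnaire_2 = ["R", "R", "L", "R", "R"]
performance = ["R", "L", "L", "R", "R"]

# Agreement between the two questionnaires bears on reliability;
# agreement between questionnaire and performance bears on validity.
reliability = percent_agreement(questionnaire_1, questionnaire_2)
validity = percent_agreement(questionnaire_1, performance)
print(f"reliability: {reliability:.0f}%   validity: {validity:.0f}%")
```

For these assumed responses the item shows 80 per cent agreement
between the two questionnaires and 60 per cent agreement with
performance.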

All of the performance tests were given by the experimenter and two
assistants, both of whom had been trained in their administration.
Definite spoken directions were given before each act was performed,
and the subjects were instructed not to start until the instruction
was completed. The articles were arranged on a large laboratory table
in positions which favored neither hand. Each subject proceeded around
the table in clockwise fashion, facing the table. This was done
because the majority of the subjects were right-sided, and in such a
manner, any advantage from bodily position was afforded to the left
side. An attempt was made to reduce the subjective element to a
minimum by giving only





1. Weisenburg, Theodore, and McBride, Katharine, "Aphasia", Commonwealth Fund, New York, 1935,
pp. 435, 451-52.

2. Travis, Lee Edward, "Speech Pathology", Appleton Co., New York, 1931, p. 59.
Name                    Age                    Sex

Grade in School                    Present Date

This is a test to determine which side you use in manual activities.

In the following questions, encircle the letter R if you perform the
certain activity with the right hand, L if you perform it with the left
hand, and the letter E if you can perform it easily with either hand.
In all of these activities, consider your hands empty when you begin to
perform them.

 1. With which foot do you kick a ball?
 2. When you cross your legs, which one is on top?
 3. When hopping on one foot, on which foot do you put your weight?
 4. Which hand holds a hammer while hammering?
 5. Which hand uses a can opener?
 6. Which hand holds the scissors (shears) while cutting?
 7. Which eye remains open when you sight with one eye through a small
    hole in a piece of paper?
 8. Which hand distributes cards when dealing them?
 9. Which hand holds the handkerchief when you blow your nose?
10. Which hand waves goodbye?
11. Which hand spins a top?
12. Which hand strikes a match?
13. Which hand winds a watch?
14. Which hand holds a toothbrush?
15. Which hand takes money from a purse?
16. Which hand holds the knife in sharpening a pencil?
17. Which hand directs the thread through the eye of a needle?
18. Which hand holds the spoon when stirring in a bowl?
19. Which hand holds the comb when you comb your hair?
20. Which hand turns the pages in a book?
21. Which hand takes the cork from a bottle of ink?
22. With which hand do you write?
23. With which hand do you use an eraser on paper?
24. Which hand cuts with the knife when eating?
25. Which hand uses a salt shaker?
26. With which hand do you bounce a rubber ball on the floor?
27. Which hand is on top when you applaud?
28. With which hand do you draw a sketch or picture?
29. Which hand turns the water faucet when you hold no glass in either
    hand?
30. With which hand do you pick up a penny from the floor?
31. Which hand uses an eraser on the blackboard?
32. Which hand throws a ball?
33. Which hand is at the top of the handle when you sweep the floor with
    a straight broom?
34. Which hand holds a tennis racquet?
35. Which hand is at the top of the handle when you rake?
36. From which shoulder do you swing a bat?
37. Which hand is at the top end of the handle when you shovel?
38. Which hand pushes the light switch on the wall?
39. Which hand puts the key in the door keyhole?
40. Which hand turns the knob in opening a door?
the direction for action, and asking no further questions. For the
second performance test, which was given no less than a month after
the second questionnaire, the articles were arranged as before, but in
a different order, corresponding to that of the items in the second
written form.

The subjects used were unselected members of the beginning speech
classes at the University of Minnesota, and members of the English
classes of the University High School on the same campus. In the first
consideration of the results, the two groups were tabulated
separately. However, the distributions on individual items were so
similar that all of the subjects were considered as one group. Since
the age range was not great, and as the test was not dependent upon
education or intelligence, it was believed that such a combination was
permissible. The entire group contained practically an equal number of
men and women. A total of 220 subjects took the first questionnaire
and the first performance test. 160 subjects filled out both
questionnaires, and 50 subjects were given the second performance test
as a check on the reliability of the first actual performance of the
activities.

Results

In considering the data, actual contingency correlations were
inadvisable because the extremely heavy weighting of subjects in the
right-handed group skewed the distributions. Since the only
information desired was whether or not the acts were performed with
the same hand in the two tests which were being compared, the method
of percentages was believed adequate. Table I gives the percentages
thus obtained.

TABLE I
(TABLE SHOWING % CASES WITH SAME ANSWER IN 2 TESTS)

Item      T1T2         Q1T1          Q1Q2
Number    (50 cases)   (220 cases)   (160 cases)

  1       90           92.72         86.25
  2       72           47.72         61.87
  3       88           50.45         70.00
  4       94           95.90         97.50
  5       98           80.90         95.00
  6       98           97.27         94.37
  7       88           57.27         62.50
  8       94           92.18         93.75
  9       68           55.45         63.12
 10       92           73.18         76.25
 11       96           94.09         93.12
 12       94           83.63         86.25
 13       96           91.81         91.25
 14       ...          92.72         90.62
 15       86           65.45         68.12
 16       96           98.18         93.75
 17       86           76.36         78.75
 18       98           89.54         88.12
 19       92           86.36         86.25
 20       68           55.45         65.00
 21       88           76.36         81.25
 22       ...          99.09         97.50
 23       98           88.63         85.61
 24       98           95.90         96.87
 25       92           79.09         78.12
 26       82           70.90         76.87
 27       84           72.72         73.75
 28       ...          95.90         96.25
 29       88           50.90         62.50
 30       82           53.63         65.00
 31       86           59.09         68.12
 32       90           92.27         91.25
 33       84           68.65         78.75
 34       98           96.36         96.25
 35       80           72.72         76.87
 36       88           89.09         91.25
 37       74           79.54         84.37
 38       68           37.27         70.62
 39       92           85.90         84.37
 40       74           47.72         66.87

Legend
T1 - 1st performance.    Q1 - 1st questionnaire.
T2 - 2nd performance.    Q2 - 2nd questionnaire.

From the percentages, it can be seen that certain items are
consistently high in all three categories, while others rank
consistently low. With the exception of batting, the bimanual
activities of sweeping, batting, raking, and shoveling, in which the
supposed lead hand has been considered symptomatic of sidedness,
showed a low relationship in the comparison of the written answer and
the actual performance. In batting, 89.09% of the subjects performed
the act as they had indicated. The percentage for sweeping was 68.6%,
for shoveling 79.54%, and for raking 72.72%. It seems, therefore, that
the questionnaire answer is not a reliable criterion of the
performance of these latter three activities.
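
The method of percentages used above can be sketched in a few lines of Python. This is an illustration only, not the author's own procedure: the function names and the five-subject responses are hypothetical, and treating "over 90%" as "at least 90" is an assumption. The four rows of percentages are copied from Table I.

```python
def percent_agreement(first, second):
    """Percentage of subjects giving the same answer (R, L or E) on both occasions."""
    if not first or len(first) != len(second):
        raise ValueError("both occasions need the same, non-zero number of subjects")
    same = sum(a == b for a, b in zip(first, second))
    return 100.0 * same / len(first)


def items_meeting(table, threshold=90.0):
    """Item numbers whose agreement reaches `threshold` in every comparison.

    Assumption: the threshold is inclusive (a value of exactly 90 passes).
    """
    return [item for item, row in sorted(table.items())
            if all(value >= threshold for value in row)]


# Hypothetical answers of five subjects to one item on two occasions.
q1 = ["R", "R", "L", "E", "R"]
t1 = ["R", "R", "L", "E", "L"]
print(percent_agreement(q1, t1))  # four of the five answers agree

# Four rows copied from Table I, as (T1T2, Q1T1, Q1Q2).
table_i = {
    4: (94, 95.90, 97.50),
    6: (98, 97.27, 94.37),
    33: (84, 68.65, 78.75),
    36: (88, 89.09, 91.25),
}
print(items_meeting(table_i))  # items high in all three comparisons
```

A simple percentage of this kind ignores agreement expected by chance, which is part of why the heavy weighting of right-handed subjects matters when reading the figures.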
Limitations of Study 

As in any random sampling of students, there were relatively few
left-sided and ambidextrous subjects included in the study. If a more
uniform number of students could be obtained in each handedness group,
a skewed distribution would be avoided, and correlations of
contingency could be used for a more refined measure of relationship
between the tests. Even with such a sampling, however, a correction
would have to be made, as the correlation would include only a
three-way table.

The objective method of administering the performance test may have
lowered the reliability by the acceptance of the side first choosing
to perform the act, without the consideration of the possibility that
the other side might also be able to carry out the activity. It is
believed, however, that this influence was small, as students who
could use either hand with equal efficiency admitted that fact without
solicitation from the examiner in their attempts to carry out the
directions. The information desired was not whether or not either side
was able to perform the act, but which side did perform that act
easily in the majority of cases.

Some one-sided acts quite frequently performed by the average
individual may have been omitted from the initial list, but it is
believed that the 40 items provided an adequate sampling for the
testing.


Summary

1. A performance test of 40 items given twice to 50 students yielded
   21 items in which the students performed the activity with the same
   side in over 90% of the cases. This test-retest indicated a high
   enough reliability of performance to warrant acceptance of the
   tests as validation of the same items on the questionnaire.

2. On duplicate questionnaires administered to 160 students, 14 of the
   reliable performance items were answered the same over 90% of the
   time. This showed a high reliability of those items on the written
   questionnaire.

3. In comparing the answers on the first questionnaire and the first
   performance test, 14 items were answered identically over 90% of
   the time. This proved that those items were valid ones for use in a
   questionnaire, using the actual performance as a validating
   measure.

4. Of the items which were proved significantly reliable on
   test-retest of performance and questionnaire, and which were
   indicated as valid by comparison of the written answer and the
   actual performance, 12 items were answered identically in over 90%
   of the cases in all three categories. This high degree of
   relationship permits use of the following items in a sidedness
   questionnaire.

   Which hand holds a hammer while hammering?
   Which hand holds the scissors while cutting?
   Which hand distributes cards while dealing them?
   Which hand spins a top?
   Which hand winds a watch?
   Which hand holds a toothbrush?
   Which hand holds the knife in sharpening a pencil?
   With which hand do you write?
   Which hand cuts with the knife when eating?
   With which hand do you draw a sketch or picture?
   Which hand throws a ball?
   Which hand holds a tennis racquet?


TEACHING AND EDUCATIONAL
by
Ibert Mellan
Philadelphia, Pennsylvania


Among the most recent of educational problems, and perhaps the most
widely discussed, is that classroom procedure known as instruction by
mechanical devices. Since the World War active steps have been taken
to equip hundreds of classrooms with appropriate apparatus to
demonstrate the part vision plays in the pursuance of efficient
teaching methods.

It is interesting to find that the attention given to visual
instruction goes farther back in educational history than would be
otherwise supposed by this only recent attention to its possibilities.

The writer has been fortunate in uncovering a vast wealth of material
hidden and filed away in the United States Patent Office which may
throw much light in the way of educational study. The following lists
of patents are, for the most part, concerned with teaching and
educational devices, appliances, apparatus, etc. The earliest record
in the United States Patent Office is the contribution of H. Chard,
dated February 16, 1809, who gives us a "Mode of Teaching to Read."
S. Randall, dated October 1, 1810, and January 11, 1812, offered a
patent entitled "Mode of Teaching to Write."

Although not of the earliest contributors to this patent literature,
the distinguished name of Dr. Maria Montessori will be found among
them. Her patent, No. 1103369 (1914), offered an educational device
for properly training the sense of touch, which, in her opinion, is an
essential factor in the proper development of the child.

Curiously, following these first two pioneers, patent literature on
education and teaching was noticeably infrequent until 1870, after
which the more
these patents is nec
schoolmen. This is a 
opportunity for educators, 
experience and years of 
to develop an abundance of adequate 
teaching material from this vast amount 
of literature which contributes to every 
type of instruction and to every subject 
in the curriculum. 
of these patents are 
cause they may have been outmoded in 
face of constant advancement in educa- 
tion and teaching methods; however, here 
is the premise of the educator who must 
decide between the useful and the unfit 
material. 

Copies of these patent specifications may be obtained from the
Commissioner of Patents, Washington, D.C.
cost is ten cents a patent. In 
ing give patent number and tit 
invention. 

With the following 
key and the patent number an approximate 
time when the patent was issued can be 
determined. 
Patent Number        Date When Issued

      1                   1790
  32000                   1820
  60000                   1840
 110000                   1860
 230000                   1880
 443000                   1890
 532000                   1895
 667000                   1900
 817000                   1905
 980000                   1910
1138000                   1915
1364000                   1920
1568000                   1925
1737000                   1930
1892000                   1932
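
Reading the key amounts to finding the two entries that bracket a given patent number. A minimal Python sketch of that lookup follows; the function name and the returned pair of years are illustrative choices, not part of the original article, and the key values are simply those printed above.

```python
from bisect import bisect_right

# (patent number, year) pairs as printed in the key above.
KEY = [
    (1, 1790), (32000, 1820), (60000, 1840), (110000, 1860),
    (230000, 1880), (443000, 1890), (532000, 1895), (667000, 1900),
    (817000, 1905), (980000, 1910), (1138000, 1915), (1364000, 1920),
    (1568000, 1925), (1737000, 1930), (1892000, 1932),
]


def approximate_issue_period(patent_number):
    """Return (earlier_year, later_year) bracketing the patent number.

    later_year is None when the number lies at or beyond the last key entry.
    """
    numbers = [number for number, _ in KEY]
    position = bisect_right(numbers, patent_number)
    if position == 0:
        raise ValueError("patent numbers start at 1")
    earlier = KEY[position - 1][1]
    later = KEY[position][1] if position < len(KEY) else None
    return earlier, later


# Dr. Montessori's patent, No. 1103369, is given in the text as 1914.
print(approximate_issue_period(1103369))
```

For No. 1103369 the lookup gives the interval 1910 to 1915, which agrees with the date of 1914 stated in the text.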


PATENTS RELATING TO THE TEACHING OF ARITHMETIC

Title of Invention                                                   Patent Number

Teaching Arithmetic                                                  4632 (1846)
Teaching Arithmetic                                                  149235
Means of Teaching Fractions                                          151971
Device for Teaching the Metric System                                176735
Apparatus for Teaching Arithmetic                                    196583
Device for Teaching Involution and Evolution                         209385
Device for Teaching Arithmetic                                       214822
Device for Teaching Arithmetic                                       215916
Apparatus for Teaching Arithmetic                                    234247
Device for Teaching Arithmetic                                       262191
Apparatus for Teaching Arithmetic                                    264572
Educational Appliance                                                296018
Means of Teaching Fractions                                          342651
Device for Teaching Fractions                                        356167
Device for Teaching Fractions and Percentage                         383300
Apparatus to Facilitate the Teaching of Notations and Numerations    384959
Device to Aid in Teaching Arithmetic                                 389415
Apparatus for Teaching Arithmetic                                    390824
Device for Teaching Numbers to Children                              416593
Device for Teaching Arithmetic                                       431102
Device for Teaching Combinations of Numbers                          452302
Teaching Arithmetical Calculations                                   462376
Apparatus for Teaching Arithmetic, etc.                              502184
Apparatus for Teaching Arithmetic, etc.                              5883871
Title of Invention





Device for Teaching Computations 
Education Device for Teaching Sperica 


Device for Teaching the Fundamental Operation with Numbers 
Device of Facilitating Teaching of Fraction 
Apparatus of Teaching and Learning Arithmetic 


Apparatus of Teaching and Learning Arithmetic 
Device for Teaching Fractional Value 


Device for Teaching Fractional 
n . " Numbers 


Device for Teaching Arithmetic 
Appliance for Teaching Arithmetic 


Device for Teaching Arithmetic 
Appliance for Teaching Arithmetic 


Appliance for Teaching Arithmetic 
Device Used in Teaching Geometry and Trigonometry 


Device for Teaching Numbers in Combination, Analysis Factors and 
Multiple 
Appliance for Teaching Arithmetic 


Device for Teaching Arithmetic 
Apparatus for Teaching Arithmetic 


Device for Teaching Division 
380532; 704979; 708568; 1248238; 666999 


CHEMISTRY 


Apparatus for Teaching Chemistry 
Appliance for Teaching Chemistry 


DRAWING 


Device for Teaching Drawing 
Cards for Teaching Drawing 


Device for Teaching Drawing 
Apparatus for Teaching Drawing 


Educational Art Text Sheet 
Charts for Teaching the Reading of Drawings 


Device for Teaching Drawing 


GEOGRAPHY 


Apparatus for Teaching Geography and Astrography 
Teaching Geography 


Educational Device for the Illustration of Longitude and Time 
Educational Globe 
Patent Number





604953 
629891 


812408 
6816204 


841158 
846484 


856068 
1043652 


1098330 
1151279 


1129890 
1174689 


1211625 
1218931 


1405010 
1541179 
1594396 
1662503 


1728584 
1730418 


1818566 


242821 
480275 


171268 
282659 


471442 
651791 


720187 
1049241 


1617207 


2426 (1842) 
143934 


526629 
418455 





Title of Invention 
PENMANSHIP-WRITING 


Mode of Teaching to Write (S. Randall) Filed Oct. 1, 1810
Mode of Teaching to Write (S. Randall) Filed Jan. 11, 1812


Plummets of Lead in Teaching Writing (T. Weston 
Diagram for Teaching Penmanship 


Sopy-slips for Teaching Penmanship 
n fn 


»pies " 


yages for Teaching Penmanship 
Device " ° - 


Device for Teaching Penmanship 
jand-guide for Use in Teaching Penmanship 


ippliance for Teaching Penmanship 
n n n n 


Device for Teaching Penmanship 
n 


" 


Teaching Penmanship 
n fn 
Teaching Penmanship 
n n 
Teaching Penmanship 
n n 


Teaching Penmanship 
fn n 


Teaching Penmanship 
n n 


Teaching Penmanship 
fn n 


Device for Teaching Penmanship 
Apparatus for Teaching Penmanship 
PHYSICAL 


Apparatus for Teaching the Art of Swimming 
" " " Swimming 


Apparatus for Teaching the Art of Boxing 
" e ” Diaphragmatic Breathing 


Apparatus for Teaching 
n n nr 


Apparatus for Teaching 


Swimming 
Children to Walk 


Swimming 


Teaching the Golf-swing 


Mechanical Figure for Teaching Golf 
Self-Teaching Walking and Dancing 


149249 


[illegible]


658464 


946886 


rn =a a3) 
f 48532 


1703403 


1815443 
Title of Invention


READING 
Mode of Teaching to Read (H. Chard) Filed Feb. 16, 1809 
Apparatus for Teaching ween and Identifying Objects 
Means for Teaching Speaking and Reading 
Table of Movable Characters for the Teaching and Practicing 
Appliance to be Used in Teaching Reading 


Means for Teaching Reading 
Device to Facilitate the Teaching of Sight-Reading 


Teaching Reading 
Teaching Reading and the Like 


Apparatus for Teaching Reading 


SPELLING 
Apparatus for Teaching Spelling 
fn n n n 
Apparatus for Teaching Word-Analysis 
Kindergarten Apparatus of Teaching Spelling 


Kindergarten Apparatus of Teaching Spelling
Game 


Kindergarten Apparatus of Teaching Spelling 
Apparatus for Teaching Spelling 


Teaching Spelling 
® " for Kindergarten 


Means for Teaching ot age and the Like 
n n the Alphabet 


Means for Aiding in Spelling and Phonics 


TYPEWRITING AND BOOKKEEPING 


Teaching Business Practice 
Machine for Teaching Touch Typewriting 


Device for Teaching Touch Typewriting 
® Typewriting 


Means for Teaching Shorthand 


Apparatus for Teaching Bookkeeping 
694944; 898114; 665991; 678618 


TELEGRAPHY 


Machine for Teaching Telegraphy 
Teaching and Practice of Telegraphy 


Volume IV, No. 




Patent Number 


of Reading 


822927 
1010782 


1224742 
1263626 


1278425 
1584627 


1514270 
There was published recently a paper¹ presenting a technique for
building tables which would simplify the calculation of chronological
ages as of a given date. According to the method described, it would
be necessary to build a new table for each date for which the ages are
to be calculated. In the case where the ages are desired accurate to
the nearest month, this becomes a rather lengthy process, and the
present method is suggested in its stead.

A single table has been constructed which can be applied for any given
date, with certain corrections to be described below. If we let

    G = year of given date,
    B = year of birth,
    d = difference in months between month of given date and birth
        month,

then [...] chronological age [...] approximately by the formula [...]
course, is [...] age is less than G [...]

In order to eliminate all negative values, and thus simplify the work,
formula (1) can be rewritten in the form:

    Age = [(G - 1) - B] years + T                              (2)

where T = d months + 1 year. The values of T are given in Table I.

1. Buros, Oscar K., "A Simple Technique for the Calculation of Chronological Ages", Journal of
Educational Research, XXVI (January, 1933), pp. 360-363.
As it stands, formula (2) does not give age correct to the nearest
month in all cases. If, for example, the given date is May 2, 1933 and
the birth date is April 27, 1920, the age by formula becomes 13-1,
while the age correct to the nearest month is 13-0. Since each month
is assumed to contain 30 days, a correction must be made for a
difference of more than fifteen days between given date and birth
date. The corrected formula follows:

    Age = [(G - 1) - B] yrs. + T + k mos.                      (3)

where k = 0 with two exceptions:

(1) If given date minus birth date > fifteen days, k = +1.
(2) If birth date minus given date > fifteen days, k = -1.

Only one of the above inequalities can arise for any given date.

Outline of Procedure

A concrete example will be taken to illustrate the use of Table I.
Suppose the birth dates of four individuals to be (a) March 25, 1919,
(b) July 12, 1917, (c) December 2, 1922, and (d) November 30, 1914. It
is desired to find their ages on August 8, 1932.

(1) Determine G - 1 and subtract from it each value of B in turn.
    Here, G - 1 = 1931. (The results of steps (1) to (4) are
    summarized in Table II.)
(2) Find the values of T in the appropriate column and rows of
    Table I. In the example, all the values are from column 8 (Aug.)
    since August is the month of the given date. T for individual (a)
    is 1-5, and is read, "one year, five months."
(3) Note the days of the month for which k is different from zero.
    Here, k = -1 for every birth date > 24. For all other birth dates
    k = 0.
(4) Add the value of k in months to the sum of the results of steps
    (1) and (2).





TABLE II

SUMMARY OF THE STEPS IN CALCULATING THE CHRONOLOGICAL
AGES OF FOUR INDIVIDUALS AS OF AUGUST 8, 1932

     Birth Date            (1)     (2)     (3)     (4)

a    March 25, 1919        ...     1-5     ...     ...
b    July 12, 1917         14      1-1     0       15-1
c    December 2, 1922      9       0-8     0       9-8
d    November 30, 1914     ...     0-9     ...     ...
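
Formula (3) can also be worked mechanically. The following Python sketch is one reading of it, offered as an illustration rather than as the author's own procedure: the function name and argument order are hypothetical, T is computed directly as d months plus one year instead of being read from Table I, and k is taken from the two fifteen-day rules stated with formula (3), comparing the day of the month of the given date with that of the birth date.

```python
def chronological_age(given, birth):
    """Age as (years, months) by formula (3).

    `given` and `birth` are (year, month, day) tuples.
    """
    g_year, g_month, g_day = given
    b_year, b_month, b_day = birth

    # Step (1): (G - 1) - B, expressed in months.
    base_months = ((g_year - 1) - b_year) * 12

    # Step (2): T = d months + 1 year, which is never negative.
    t_months = (g_month - b_month) + 12

    # Step (3): the fifteen-day correction k, in months.
    if g_day - b_day > 15:
        k = 1
    elif b_day - g_day > 15:
        k = -1
    else:
        k = 0

    # Step (4): add everything and express as years and months.
    return divmod(base_months + t_months + k, 12)


# The example from the text: May 2, 1933 against April 27, 1920.
print(chronological_age((1933, 5, 2), (1920, 4, 27)))

# Individuals (b) and (c) as of August 8, 1932.
print(chronological_age((1932, 8, 8), (1917, 7, 12)))
print(chronological_age((1932, 8, 8), (1922, 12, 2)))
```

Under these assumptions the sketch returns 13-0 for the May 2, 1933 example and 15-1 and 9-8 for individuals (b) and (c), the values that survive in the text and in Table II. Applied to individuals (a) and (d) it gives 13-4 and 17-8; these are computed here and are not taken from the printed table.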











